week 6

This week my reasearch project was solidified. I am going to be working on basic machine learning algorithms to help classify really narrow water streamlines based on geospatial data. I started working on Dr. Jiang’s training lab exercises. The exercises essentially helped walk me through the data preperation pipeline. First gettting the geospatial information about a piece of land in the form of raster data. Then buffering the stream data, because real life streams have a width and aren’t just thin lines, and storing it as a stream raster where each pixel marks whether there is a stream or not. This will help align the stream classification data with the geospatial data for the supervised learning step. Then we tabulate all this data using pandas geodataframe as this is a format that most machine learning models accept.

Note: Tabulating the data causes each pixel to be treated as an individual component. In real life one pixel being a stream or not is very telling of whether the neighbouring pixels are a part of the stream or not. So, this is an immediate shortcoming that can be spotted in the approach that we are taking.

Written on June 27, 2021

click here to go back.