Introduction
This vignette illustrates how FOSTER can be used to impute forest variables derived from airborne laser scanning (ALS), the response variables Y, to a larger area covered by multispectral satellite imagery and topographic data, the predictor variables X. An imputation problem is usually described in terms of two sets of observations: references and targets. At reference observations, both Y and X variables are known, whereas at target observations only X variables are available. The targets therefore make up the area over which the response variables are to be imputed.
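To make the reference/target distinction concrete, here is a minimal sketch of nearest-neighbour imputation in plain Python. It is not FOSTER code: the toy values, the predictor choice (a spectral index and an elevation), and the helper names `min_max_scaler` and `impute_nearest` are all assumptions made for illustration, and k is fixed to 1 for simplicity.

```python
import math

# Hypothetical toy data (values invented for illustration).
# References have both predictors X and responses Y; targets have X only.
references = [
    {"x": (0.80, 350.0), "y": {"height_m": 22.0}},
    {"x": (0.40, 900.0), "y": {"height_m": 9.0}},
    {"x": (0.65, 500.0), "y": {"height_m": 17.0}},
]
targets = [
    {"x": (0.78, 370.0)},
    {"x": (0.42, 880.0)},
]


def min_max_scaler(rows):
    """Return a function that rescales each predictor to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]
    return lambda row: tuple((v - lo) / s for v, lo, s in zip(row, lows, spans))


def impute_nearest(references, targets):
    """Copy the Y values of the closest reference (in X space) to each target."""
    # Scale predictors so that elevation does not dominate the distance.
    scale = min_max_scaler([obs["x"] for obs in references + targets])
    imputed = []
    for target in targets:
        nearest = min(
            references,
            key=lambda ref: math.dist(scale(ref["x"]), scale(target["x"])),
        )
        imputed.append(dict(nearest["y"]))
    return imputed


imputed = impute_nearest(references, targets)
```

Each target receives the Y values of the reference whose predictors are most similar to its own, which is the core idea behind k-NN imputation when k = 1.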
FOSTER has been designed around the following workflow:
- Preprocess the data as needed: match the extent and spatial resolution of the input layers, mask cells that will be excluded from the analysis, and apply spatial filtering to smooth the data before the main processing
- Calculate spectral indices from multispectral data and summarize their time series
- Perform stratified random sampling to select the cells used to train the k-NN model and assess its accuracy
- Divide the stratified random sample into training and validation sets
- Train a k-NN model on the training set and assess its accuracy on the validation set
- Impute response variables from sampled reference observations to the targets
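
The sampling, partitioning, and validation steps of this workflow can be sketched as follows. This is a plain Python illustration under stated assumptions, not FOSTER's API: the synthetic cells, the three strata, the 75/25 split, the choice of k = 3, the averaging of neighbour values, and all function names are hypothetical. Preprocessing and spectral index calculation are not covered.

```python
import math
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical reference cells: one predictor x, one response y, and a
# stratum label derived from x (three strata of 100 cells each).
cells = []
for i in range(300):
    x = (i + 0.5) / 300
    y = 30.0 * x + rng.gauss(0.0, 1.0)
    cells.append({"id": i, "x": (x,), "y": y, "stratum": min(int(x * 3), 2)})


def stratified_sample(cells, n_per_stratum, rng):
    """Draw the same number of cells at random from every stratum."""
    by_stratum = {}
    for cell in cells:
        by_stratum.setdefault(cell["stratum"], []).append(cell)
    sample = []
    for label in sorted(by_stratum):
        sample.extend(rng.sample(by_stratum[label], n_per_stratum))
    return sample


def split(sample, train_fraction, rng):
    """Randomly divide the sample into training and validation sets."""
    shuffled = sample[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def knn_predict(train, x, k):
    """Average the response of the k training cells closest to x."""
    neighbours = sorted(train, key=lambda c: math.dist(c["x"], x))[:k]
    return sum(c["y"] for c in neighbours) / k


def rmse(train, validation, k):
    """Root mean squared error of k-NN predictions on the validation set."""
    errors = [(knn_predict(train, c["x"], k) - c["y"]) ** 2 for c in validation]
    return math.sqrt(sum(errors) / len(errors))


sample = stratified_sample(cells, n_per_stratum=20, rng=rng)
train, validation = split(sample, train_fraction=0.75, rng=rng)
validation_rmse = rmse(train, validation, k=3)
```

Here "training" a k-NN model amounts to storing the training cells, since neighbours are looked up at prediction time. The validation cells are held out so that the reported error reflects cells the model has not seen.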