Guidance on and comparison of machine learning classifiers for Landsat-based land cover and land use mapping

Shih, Hsiao-chien, Stow, Douglas A., Tsai, Yu Hsin
International journal of remote sensing 2019 v.40 no.4 pp. 1248-1274
Landsat, computer software, data collection, decision support systems, neural networks, remote sensing, support vector machines, China, Ghana
Remote sensing scientists are increasingly adopting machine learning classifiers for land cover and land use (LCLU) mapping, but model selection, a critical step in machine learning classification, has usually been ignored in past research. In this paper, step-by-step guidance on supervised learning with model selection (covering classifier training, model selection, and map production) is first provided. Then, model selection is exhaustively applied to different machine learning classifiers (e.g. Artificial Neural Network (ANN), Decision Tree (DT), Support Vector Machine (SVM), and Random Forest (RF)) to identify the optimal polynomial degree of input features (d) and hyperparameters, using Landsat imagery of study regions in China and Ghana. We evaluated the map accuracy and computing time associated with different machine learning classification software packages (i.e. ArcMap, ENVI, TerrSet, and R). The optimal classifiers and their associated polynomial degree of input features and hyperparameters differ between the two image datasets that were tested. The optimum combination of d and hyperparameters for each type of classifier was used across software packages, but some classifiers (i.e. ENVI and TerrSet ANN) were customized because of software constraints. The LCLU map derived from ENVI SVM has the highest overall accuracy (72.6%) for the Ghana dataset, while the LCLU map derived from R DT has the highest overall accuracy (48.0%) for the China (FNNR) dataset. All LCLU maps for the Ghana dataset are more accurate than those for the China (FNNR) dataset, likely because the training data for the China dataset are more limited and uncertain. For the Ghana dataset, LCLU maps derived from tree-based classifier routines (ArcMap RF, TerrSet DT, and R RF) are accurate, but these maps contain artefacts resulting from model overfitting.
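
The model selection step described in the abstract, an exhaustive search over the polynomial degree of input features (d) and classifier hyperparameters, might be sketched as follows. This is an illustration only and not the authors' implementation. It makes several assumptions: the data are synthetic two-band samples rather than Landsat pixels, a k-nearest-neighbour classifier stands in for the ANN, DT, SVM, and RF classifiers named above, k-fold cross-validation is assumed as the scoring method, and all function names (polynomial_features, knn_predict, cv_accuracy, select_model) are hypothetical.

```python
import itertools
import random
from collections import Counter


def polynomial_features(row, d):
    """Expand a feature vector with all monomials of total degree 1..d."""
    out = []
    for degree in range(1, d + 1):
        for combo in itertools.combinations_with_replacement(row, degree):
            term = 1.0
            for value in combo:
                term *= value
            out.append(term)
    return out


def knn_predict(train_X, train_y, x, k):
    """Stand-in classifier (assumption): majority vote of the k nearest samples."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def cv_accuracy(X, y, d, k, n_folds=5):
    """Mean accuracy over n_folds interleaved folds for one (d, k) candidate."""
    Xd = [polynomial_features(row, d) for row in X]
    n = len(Xd)
    correct = 0
    for fold in range(n_folds):
        test_idx = set(range(fold, n, n_folds))
        train_X = [Xd[i] for i in range(n) if i not in test_idx]
        train_y = [y[i] for i in range(n) if i not in test_idx]
        for i in test_idx:
            correct += knn_predict(train_X, train_y, Xd[i], k) == y[i]
    return correct / n


def select_model(X, y, degrees, ks, n_folds=5):
    """Exhaustively score every (d, k) pair and return the best one."""
    results = {
        (d, k): cv_accuracy(X, y, d, k, n_folds)
        for d, k in itertools.product(degrees, ks)
    }
    best = max(results, key=results.get)
    return best, results


# Synthetic two-class, two-band samples (assumption: not real imagery).
rng = random.Random(0)
y = [i % 2 for i in range(60)]
X = [[rng.gauss(1.5 * label, 1.0), rng.gauss(-1.5 * label, 1.0)] for label in y]

best, results = select_model(X, y, degrees=[1, 2, 3], ks=[1, 3, 5])
print("best (d, k):", best, "cross-validated accuracy:", results[best])
```

In practice the same loop would be run once per classifier type, with the selected d and hyperparameters then used to train the final classifier and produce the map; a library routine for grid search would normally replace the hand-written loop.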