Exploring prediction uncertainty of spatial data in geostatistical and machine learning approaches
- Fouedjio, Francky, Klump, Jens
- Environmental earth sciences 2019 v.78 no.1 pp. 38
- artificial intelligence, forests, geostatistics, kriging, models, prediction, regression analysis, spatial data, uncertainty
- Geostatistical methods such as kriging with external drift (KED) as well as machine learning techniques such as quantile regression forest (QRF) have been extensively used for the modeling and prediction of spatially distributed continuous variables when auxiliary information is available everywhere within the region under study. In addition to providing predictions, both methods are able to deliver a quantification of the uncertainty associated with the prediction. In this paper, kriging with external drift and quantile regression forest are compared with respect to their ability to deliver reliable predictions and prediction uncertainties of spatial data. The comparison is carried out using both synthetic and real-world spatial data. The results indicate that the superiority of KED over QRF can be expected when there is a linear relationship between the variable of interest and auxiliary variables, and the variable of interest shows a strong or weak spatial correlation. On the other hand, the superiority of QRF over KED can be expected when there is a non-linear relationship between the variable of interest and auxiliary variables, and the variable of interest exhibits a weak spatial correlation. Moreover, when there is a non-linear relationship between the variable of interest and auxiliary variables, and the variable of interest shows a strong spatial correlation, one can expect QRF to outperform KED in terms of prediction accuracy but not in terms of prediction uncertainty accuracy.
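
The abstract distinguishes prediction accuracy from prediction uncertainty accuracy but does not say how the latter is measured. One common check, assumed here for illustration and not taken from the paper, is to compare the nominal level of a prediction interval with the fraction of held-out observations that actually fall inside it. The sketch below uses synthetic data and a Gaussian predictive distribution (the form usually derived from a kriging prediction and kriging variance); for QRF the interval bounds would instead be the predicted lower and upper quantiles. The function names `interval_coverage` and `gaussian_interval` are hypothetical.

```python
import random
from statistics import NormalDist


def interval_coverage(y_true, lower, upper):
    """Fraction of observations that fall inside their prediction intervals."""
    hits = sum(lo <= y <= hi for y, lo, hi in zip(y_true, lower, upper))
    return hits / len(y_true)


def gaussian_interval(mean, sd, level):
    """Central prediction interval of a Gaussian predictive distribution."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * sd, mean + z * sd


# Synthetic stand-in for held-out data (assumption: the predictive mean and
# standard deviation are exactly right, so coverage should match the nominal level).
rng = random.Random(0)
sd = 1.5
means = [rng.uniform(0, 10) for _ in range(5000)]
y_true = [rng.gauss(m, sd) for m in means]

level = 0.9
bounds = [gaussian_interval(m, sd, level) for m in means]
lower = [b[0] for b in bounds]
upper = [b[1] for b in bounds]

coverage = interval_coverage(y_true, lower, upper)
print(f"nominal {level:.2f}, empirical {coverage:.3f}")
```

An empirical coverage well below the nominal level suggests intervals that are too narrow, and one well above it suggests intervals that are too wide; repeating the check over several levels gives a fuller picture than a single level.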