Performance of median kriging with robust estimators of the variogram in outlier identification and spatial prediction for soil pollution at a field scale

Sun, Xiao-Lin, Wu, Yun-Jin, Zhang, Chaosheng, Wang, Hui-Li
The Science of the total environment 2019 v.666 pp. 902-914
case studies, geostatistics, kriging, lead, prediction, soil, soil pollution, spatial data, China
Median kriging with robust estimators of the variogram has been proposed in the literature to reduce the influence of outliers in spatial data of soil pollution, because median kriging can utilize outliers in spatial prediction and robust estimators can overcome the bias caused by outliers. However, the performance of the method at a field scale remains unknown. Using two case studies of soil Pb pollution, this study compared the method with two other commonly used methods for outlier identification, the box-plot and the standardized kriging prediction error (SKPE), and with two classical geostatistical approaches for spatial prediction, kriging with and without outliers. One case was based on 359 samples collected in an area of 14.5 km² in Jura, Switzerland. The other was based on 242 samples collected in an area of 2.8 km² in Zhuzhou, China. Results showed that the method identified both global and local outliers, although it did not identify all of the global outliers flagged by the box-plot. For the Jura data, which were more seriously affected by outliers than the Zhuzhou data, the method identified 49 outliers, sharing 39 with SKPE, which identified a total of 46 outliers. For the Zhuzhou data, the method found just three outliers, far fewer than the 12 outliers identified by SKPE. In the case of Jura, kriging prediction with outliers winsorized by the method was negligibly more accurate than prediction without the outliers identified by SKPE, e.g., by 0.15% in terms of root mean square error (RMSE). However, in the case of Zhuzhou, the former prediction was slightly less accurate than the latter, e.g., by 2.39% in terms of RMSE. This study suggested that the method performed well for data that were seriously affected by outliers, but not so well for data only slightly affected by outliers.
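As a rough illustration of two quantities named in the abstract, the following minimal Python sketch flags global outliers with a box-plot rule and computes RMSE together with a relative RMSE difference. It is not the authors' procedure: the 1.5 × IQR fence, the quartile convention, the function names, and the formula for the percentage difference are all assumptions made for this example, and median kriging and SKPE themselves are not implemented here.

```python
import math
import statistics


def boxplot_outliers(values, k=1.5):
    """Return indices of values outside the box-plot fences.

    Assumption: fences at Q1 - k*IQR and Q3 + k*IQR with k = 1.5,
    quartiles computed by statistics.quantiles (default method).
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < lower or v > upper]


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_rmse_difference(rmse_a, rmse_b):
    """Percentage by which rmse_a exceeds rmse_b (assumed definition)."""
    return 100.0 * (rmse_a - rmse_b) / rmse_b


# Hypothetical Pb concentrations, not data from the study.
pb = [10, 12, 11, 13, 12, 11, 10, 12, 300]
flagged = boxplot_outliers(pb)
```

A box-plot rule of this kind looks only at the distribution of values, so it can flag global outliers but not local ones, which are unusual only relative to their spatial neighbours.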