A soil moisture estimation framework based on the CART algorithm and its application in China

Han, Jiaqi, Mao, Kebiao, Xu, Tongren, Guo, Jingpeng, Zuo, Zhiyuan, Gao, Chunyu
Journal of hydrology 2018 v.563 pp. 65-75
algorithms, artificial intelligence, correlation, data collection, drought, drying, humid zones, models, prediction, regression analysis, semiarid zones, soil properties, soil water, summer, temperature, vegetation, China
Soil moisture is an important parameter associated with the land-atmosphere interface and is highly influenced by multiple factors. Previous studies have provided an effective mechanism for accurately estimating soil moisture by building a global estimation model that comprehensively integrates multiple factors at a local scale. However, a global model is inefficient for accurately estimating soil moisture at a large or even global scale because of the complex surface features that make it difficult to fit data globally. Furthermore, inconsistencies in the spatial integrity between multisource data and the mismatch between the training space and application space decrease the generalizability of the model, which may lead to unreasonable soil moisture values in certain areas. This study proposes a “pyramid” framework that integrates multiple factors from different sources using the classification and regression tree (CART) algorithm, a machine learning method, to estimate soil moisture at a high spatial resolution (1 km). The framework considers soil moisture as a response variable and several factors, such as precipitation, soil properties, and temperature, as explanatory variables. The framework uses piecewise fitting instead of global fitting and avoids the generation of unreasonable values. A k-fold cross-validation approach using “hold-out” years was used to assess the performance of the soil moisture estimation framework for the summer period. The results show that the performance of the framework was relatively stable during the study period with low variabilities in the r values (1 STD < 0.06) and error measures (1 STD < 0.05). The results predicted based on the framework are more accurate than the temperature vegetation drought index (TVDI) results. 
The correlation coefficients between the TVDI and soil moisture observations in June, July and August were 0.49, 0.29 and 0.49, respectively, whereas those between the predictions and observations were 0.70, 0.68 and 0.69, respectively, which reflected increases of 0.21, 0.39 and 0.20, respectively. The spatiotemporal analysis of summer soil moisture from 2000 to 2014 exhibited a significant wetting trend; the spatial patterns were characterized by wetting trends over arid and humid regions and drying trends over semi-arid regions. The results indicate that the “pyramid” framework can provide a soil moisture dataset with reasonable accuracy and high spatial resolution.
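
The piecewise fitting and the "hold-out" year validation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation and does not reproduce the "pyramid" layering. It is a minimal CART-style regression tree in pure Python, run on synthetic data, with hypothetical names (`fit_tree`, `hold_out_year_cv`) and assumed settings (`max_depth=4`, `min_leaf=5`, two made-up predictors). Each leaf predicts the mean of its training targets, so a prediction cannot fall outside the observed soil moisture range, which is one way a piecewise fit avoids unreasonable values.

```python
import random
from statistics import mean

def sse(ys):
    """Sum of squared errors around the mean (regression impurity used by CART)."""
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(X, y, min_leaf):
    """Return the (feature, threshold) that most reduces total SSE, or None."""
    best, best_cost = None, sse(y)
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between neighbouring values
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = sse(left) + sse(right)
            if cost < best_cost - 1e-12:
                best, best_cost = (j, t), cost
    return best

def fit_tree(X, y, depth=0, max_depth=4, min_leaf=5):
    """Grow a tree; a leaf is a float, an internal node is (feature, threshold, left, right)."""
    split = best_split(X, y, min_leaf) if depth < max_depth else None
    if split is None:
        return mean(y)  # leaf: mean of the training targets that reached it
    j, t = split
    lo = [(r, v) for r, v in zip(X, y) if r[j] <= t]
    hi = [(r, v) for r, v in zip(X, y) if r[j] > t]
    return (j, t,
            fit_tree([r for r, _ in lo], [v for _, v in lo], depth + 1, max_depth, min_leaf),
            fit_tree([r for r, _ in hi], [v for _, v in hi], depth + 1, max_depth, min_leaf))

def predict(tree, row):
    """Walk from the root to a leaf and return the leaf mean."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

def pearson_r(a, b):
    """Pearson correlation; returns 0.0 if either series is constant."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0

def hold_out_year_cv(records):
    """records: list of (year, features, soil_moisture). One fold per held-out year."""
    scores = {}
    for year in sorted({r[0] for r in records}):
        train = [r for r in records if r[0] != year]
        test = [r for r in records if r[0] == year]
        tree = fit_tree([r[1] for r in train], [r[2] for r in train])
        preds = [predict(tree, r[1]) for r in test]
        scores[year] = pearson_r(preds, [r[2] for r in test])
    return scores

# Synthetic data only: the coefficients below are invented for illustration.
rng = random.Random(0)
records = []
for year in range(2000, 2005):
    for _ in range(60):
        precip = rng.uniform(0, 200)  # precipitation, mm (synthetic)
        temp = rng.uniform(10, 35)    # temperature, deg C (synthetic)
        sm = 0.15 + 0.001 * precip - 0.002 * (temp - 10) + rng.gauss(0, 0.02)
        records.append((year, [precip, temp], sm))

scores = hold_out_year_cv(records)
for year, r in scores.items():
    print(year, round(r, 2))
```

Holding out whole years, rather than random samples, keeps observations from the same season out of both the training and test sets, so the per-year spread of `r` gives a rough view of temporal stability similar in spirit to the standard deviations reported in the abstract.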