Flash flood susceptibility modeling using an optimized fuzzy rule based feature selection technique and tree based ensemble methods

Bui, Dieu Tien, Tsangaratos, Paraskevas, Ngo, Phuong-Thao Thi, Pham, Tien Dat, Pham, Binh Thai
The Science of the total environment 2019 v.668 pp. 1038-1054
algorithms, case studies, floods, models, prediction, Vietnam
The main objective of the present study was to provide a novel methodological approach for flash flood susceptibility modeling based on a feature selection method (FSM) and tree-based ensemble methods. The FSM used a fuzzy rule-based algorithm, FURIA, as the attribute evaluator and a genetic algorithm (GA) as the search method, in order to obtain the optimal set of variables for flood susceptibility modeling. The novel FURIA-GA was combined with the LogitBoost, Bagging and AdaBoost ensemble algorithms. The performance of the developed methodology was evaluated in the Bao Yen and Bac Ha districts of Lao Cai Province in the Northeast region of Vietnam. For the case study, 654 floods and twelve geo-environmental variables were used. The predictive performance of each model was estimated by calculating the classification accuracy, the sensitivity, the specificity, the success and prediction rate curves, and the area under the curves (AUC). Compared to a conventional rule-based method, the FURIA-GA FSM gave more accurate predictive results. Also, the FURIA-GA based models showed higher learning and predictive ability than the ensemble models that had not undergone feature selection. Based on the predictive classification accuracy, FURIA-GA-Bagging (93.37%) outperformed FURIA-GA-LogitBoost (92.35%) and FURIA-GA-AdaBoost (89.03%). FURIA-GA-Bagging also showed the highest sensitivity (96.94%) and specificity (89.80%). On the other hand, FURIA-GA-LogitBoost showed the lowest percentage in the very high susceptibility zone and the highest relative flash-flood density, whereas FURIA-GA-AdaBoost achieved the highest prediction AUC value (0.9740), based on the prediction rate curve, followed by FURIA-GA-Bagging (0.9566) and FURIA-GA-LogitBoost (0.8955). It can be concluded that different statistical metrics give different outcomes concerning the best prediction model, which could mainly be attributed to site-specific settings. The proposed models could be considered novel alternative investigation tools appropriate for flash flood susceptibility mapping.
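
The classification accuracy, sensitivity and specificity named in the abstract follow from the four cells of a confusion matrix. The sketch below is only an illustration of that arithmetic: the class `ConfusionCounts`, the function `classification_metrics` and the counts themselves are hypothetical and are not taken from the study. The counts were chosen so that the resulting percentages match those reported for FURIA-GA-Bagging, and they assume a validation set with equal numbers of flood and non-flood locations, which the abstract does not state.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # flood locations classified as flood
    fn: int  # flood locations classified as non-flood
    tn: int  # non-flood locations classified as non-flood
    fp: int  # non-flood locations classified as flood


def classification_metrics(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity and accuracy as fractions in [0, 1]."""
    total = c.tp + c.fn + c.tn + c.fp
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # share of flood points found
        "specificity": c.tn / (c.tn + c.fp),  # share of non-flood points found
        "accuracy": (c.tp + c.tn) / total,    # share of all points correct
    }


# Hypothetical counts (NOT the study's data): 98 flood and 98 non-flood points.
example = ConfusionCounts(tp=95, fn=3, tn=88, fp=10)
metrics = classification_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

With equal class sizes, accuracy is the mean of sensitivity and specificity, which agrees with the reported figures: (96.94% + 89.80%) / 2 = 93.37%.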