The use of random forests in modelling short-term air pollution effects based on traffic and meteorological conditions: A case study in Wrocław

Kamińska, Joanna A.
Journal of environmental management 2018 v.217 pp. 164-174
air pollution, air quality, atmospheric pressure, case studies, climatic factors, models, nitrogen content, nitrogen dioxide, particulates, pollutants, relative humidity, summer, temperature, traffic, warm season, wind direction, wind speed
Random forests, an advanced data mining method, are used here to model the regression relationships between concentrations of the pollutants NO2, NOx and PM2.5, and nine variables describing meteorological conditions, temporal conditions and traffic flow. The study was based on hourly values of wind speed, wind direction, temperature, air pressure and relative humidity, temporal variables, and finally traffic flow, in the two years 2015 and 2016. An air quality measurement station was selected on a main road, located a short distance (40 m) from a large intersection equipped with a traffic flow measurement system. Nine different time subsets were defined, based among other things on the climatic conditions in Wrocław. An analysis was made of the fit of models created for those subsets, and of the importance of the predictors. Both the fit and the importance of particular predictors were found to be dependent on season. The best fit was obtained for models created for the six-month warm season (April–September) and for the summer season (June–August). The most important explanatory variable in the models of concentrations of nitrogen oxides was traffic flow, while in the case of PM2.5 the most important were meteorological conditions, in particular temperature, wind speed and wind direction. Temporal variables (except for month in the case of PM2.5) were found to have no significant effect on the concentrations of the studied pollutants.
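
As a rough illustration of the approach the abstract describes, the sketch below fits a random forest regression to nine predictors and ranks them by importance. It is not the author's code or data. The values are synthetic, the column names are hypothetical, the choice of hour and weekday as temporal variables is an assumption (the abstract names only month), and the scikit-learn impurity-based importance used here may differ from the importance measure used in the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic hourly records: NOT the Wrocław measurements.
rng = np.random.default_rng(0)
n = 2000

# Hypothetical column names for the nine predictors.
# "hour" and "weekday" are assumed; the abstract only names "month".
columns = {
    "traffic_flow": rng.uniform(0, 3000, n),       # vehicles per hour
    "wind_speed": rng.uniform(0, 10, n),           # m/s
    "wind_direction": rng.uniform(0, 360, n),      # degrees
    "temperature": rng.uniform(-10, 30, n),        # deg C
    "air_pressure": rng.uniform(980, 1030, n),     # hPa
    "relative_humidity": rng.uniform(30, 100, n),  # percent
    "hour": rng.integers(0, 24, n),
    "weekday": rng.integers(0, 7, n),
    "month": rng.integers(1, 13, n),
}
feature_names = list(columns)
X = np.column_stack([columns[name] for name in feature_names])

# Assumed toy response: the pollutant rises with traffic and falls with wind speed.
no2 = (10 + 0.02 * columns["traffic_flow"]
       - 1.5 * columns["wind_speed"]
       + rng.normal(0, 3, n))

# Fit the forest; the out-of-bag score gives a fit estimate without a separate test set.
model = RandomForestRegressor(n_estimators=100, oob_score=True, random_state=0)
model.fit(X, no2)

# Impurity-based importances, normalised by scikit-learn to sum to 1.
importances = dict(zip(feature_names, model.feature_importances_))
ranked = sorted(importances, key=importances.get, reverse=True)

print(f"OOB R^2: {model.oob_score_:.3f}")
for name in ranked:
    print(f"{name:18s} {importances[name]:.3f}")
```

Repeating the fit on a seasonal subset (for example, rows with month between 4 and 9 for the warm season) and comparing the scores and rankings would mirror the per-subset comparison the abstract reports.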