ACTA Scientiarum Naturalium Universitatis Pekinensis
Improving Air Quality Forecast Accuracy in Urumqi-Changji-Shihezi Region Using an Ensemble Deep Learning Approach
ZHANG Bin1, LÜ Baolei2, WANG Xinlu3, ZHANG Wenxian3,†, HU Yongtao4
1. Xinjiang Bingtuan Environmental Protection Sciences Research Institute, Urumqi 830002; 2. Huayun Sounding Meteorological Technology Company, Ltd., Beijing 102299; 3. Hangzhou Aima Technologies, Hangzhou 311121; 4. School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA 30332; † Corresponding author, E-mail: pkuzhangwx@gmail.com
Abstract A post-correction framework based on raw forecasts from the numerical air quality model CMAQ is implemented in the Urumqi-Changji-Shihezi region of Xinjiang Autonomous Region to improve PM2.5 forecasting performance. An ensemble deep learning method is used to correct errors in the original CMAQ forecasts. The method integrates four machine learning models: a deep neural network model, a random forest model, a gradient boosting model and a generalized linear model. Each model takes the original meteorological forecasts, air quality forecasts and land use types as input data. In an independent evaluation with data from 2018, the accuracy of the “bias-corrected” forecasts is significantly improved. The R² values of the 5-day forecasts are 0.41–0.60, an improvement of 60%–160% over the original forecasts, while the RMSE values are reduced by ~40%. In the cross evaluation, the R² values of the post-corrected results increase by 50%–80%, while the RMSE values are reduced by ~30%. The post-correction method is computationally efficient and can be deployed operationally for reliable daily forecasting.
Key words objective correction; multi-source data; machine learning; ensemble forecast
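The evaluation idea summarized in the abstract can be illustrated with a minimal sketch. It rests on several assumptions: the four member forecasts are combined by an unweighted mean (the abstract does not state how the members are integrated), R² is taken as the squared Pearson correlation, and every number below is made up for illustration rather than taken from the study. The function names `rmse`, `r2` and `ensemble_mean` are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root-mean-square error between observations and predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r2(obs, pred):
    """Squared Pearson correlation (an assumed definition of R2)."""
    n = len(obs)
    mean_o, mean_p = sum(obs) / n, sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov * cov / (var_o * var_p)


def ensemble_mean(member_preds):
    """Unweighted mean across members (assumed combination rule)."""
    return [sum(vals) / len(vals) for vals in zip(*member_preds)]


# Illustrative PM2.5 values only; not data from the paper.
obs = [35.0, 80.0, 120.0, 60.0, 45.0]   # observed concentrations
raw = [20.0, 50.0, 70.0, 65.0, 30.0]    # stand-in for raw CMAQ forecasts

# Stand-ins for the corrected outputs of the four member models.
members = {
    "deep_neural_network": [38.0, 75.0, 110.0, 62.0, 47.0],
    "random_forest": [33.0, 78.0, 112.0, 58.0, 44.0],
    "gradient_boosting": [36.0, 84.0, 125.0, 61.0, 43.0],
    "generalized_linear": [30.0, 70.0, 105.0, 66.0, 50.0],
}
corrected = ensemble_mean(list(members.values()))

print("RMSE raw:", rmse(obs, raw), "corrected:", rmse(obs, corrected))
print("R2   raw:", r2(obs, raw), "corrected:", r2(obs, corrected))
```

In an operational setting the member outputs would come from models trained on the meteorological forecasts, air quality forecasts and land use types, and the combination rule could be a learned one rather than a plain mean.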