An integrated prediction model of PM2.5 concentration based on TPE-XGBOOST and LassoLars

WENG Kerui, LIU Miao, LIU Qian

Systems Engineering - Theory & Practice, 2020, Vol. 40, Issue 3: 748-760.

Abstract

Because atmospheric pollution is severe, early warning and forecasting of PM2.5 concentration are particularly important. PM2.5 concentration time series are highly complex and random, yet the traditional decomposition-integrated prediction method relies only on historical values and does not take air quality and meteorological factors into account, which makes accurate prediction difficult. This paper decomposes the historical data and applies a TPE-XGBOOST model to the high-frequency components and a LassoLars model to the low-frequency components, combining air quality and meteorological factors to reflect the variation trend of the decomposed components, and then predicts the PM2.5 concentration time series. In the experiments, the model predicts well and achieves higher prediction accuracy than the single decomposition-integrated prediction model.
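The decompose-then-combine structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: a trailing moving average stands in for the decomposition step, scikit-learn's GradientBoostingRegressor stands in for the TPE-tuned XGBOOST model (no hyper-parameter search is performed), the data are synthetic, and the helper names split_frequencies and make_features are hypothetical. Only LassoLars is used as named in the paper.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LassoLars


def split_frequencies(series, window=24):
    """Split a series into a smooth low-frequency part and a high-frequency remainder.

    Assumption: a trailing moving average is a simple stand-in for the
    paper's decomposition step; it is NOT the decomposition used there.
    """
    kernel = np.ones(window) / window
    padded = np.pad(series, (window - 1, 0), mode="edge")
    low = np.convolve(padded, kernel, mode="valid")  # same length as series
    return low, series - low


def make_features(component, factors, n_lags=3):
    """Build rows of lagged component values plus the previous step's factors."""
    n = len(component)
    lags = [component[n_lags - k - 1 : n - k - 1] for k in range(n_lags)]
    X = np.column_stack(lags + [factors[n_lags - 1 : n - 1]])
    y = component[n_lags:]
    return X, y


# Synthetic data only: two made-up exogenous factors and a made-up target series.
rng = np.random.default_rng(0)
n = 600
t = np.arange(n)
factors = np.column_stack([np.sin(2 * np.pi * t / 24), rng.normal(size=n)])
pm25 = 60 + 0.02 * t + 15 * factors[:, 0] + 5 * factors[:, 1] + rng.normal(scale=2, size=n)

low, high = split_frequencies(pm25)
n_lags, split = 3, 480  # first 480 feature rows for training, the rest for testing

models = {
    "low": LassoLars(alpha=0.01),  # linear model for the smooth component
    # Stand-in for TPE-XGBOOST: fixed hyper-parameters, no Bayesian search.
    "high": GradientBoostingRegressor(n_estimators=100, max_depth=3, random_state=0),
}

# Fit one model per component, then sum the component forecasts.
prediction = 0.0
for name, component in (("low", low), ("high", high)):
    X, y = make_features(component, factors, n_lags)
    models[name].fit(X[:split], y[:split])
    prediction = prediction + models[name].predict(X[split:])

actual = pm25[n_lags + split:]
rmse = float(np.sqrt(np.mean((prediction - actual) ** 2)))
print(f"test RMSE on synthetic data: {rmse:.2f}")
```

Because the two components sum back to the original series, adding the two component forecasts yields a forecast of the series itself. Swapping in the actual decomposition and a tuned XGBOOST model would change the component models but not this overall structure.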

Key words

PM2.5 concentration prediction / EEMD / Bayesian optimization / XGBOOST / Lasso / multivariate factors

Cite this article

WENG Kerui, LIU Miao, LIU Qian. An integrated prediction model of PM2.5 concentration based on TPE-XGBOOST and LassoLars. Systems Engineering - Theory & Practice, 2020, 40(3): 748-760. https://doi.org/10.12011/1000-6788-2018-2060-13

Funding

National Natural Science Foundation of China (71874163)