Article

Prediction of Daily Mean PM10 Concentrations Using Random Forest, CART Ensemble and Bagging Stacked by MARS

Journal

SUSTAINABILITY
Volume 14, Issue 2

Publisher

MDPI
DOI: 10.3390/su14020798

Keywords

air pollution; machine learning; stacking; rotation ensemble; bagging; selective ensemble; diversity strategy

Funding

  1. Bulgarian National Science Fund (BNSF) [KP-06-IP-CHINA/1 (КП-06-ИП-КИТАЙ/1)]

A novel framework based on machine learning was developed to predict the daily average concentrations of PM10 in Bulgaria. The framework used meteorological parameters as independent variables and built efficient predictive models to improve accuracy.
A novel framework for stacked regression based on machine learning was developed to predict the daily average concentrations of particulate matter (PM10), one of Bulgaria's primary health concerns. Measurements of nine meteorological parameters were used as independent variables. The goal was to study a limited number of initial predictors carefully and extract stochastic information from them to build an extended data set that allowed the creation of highly efficient predictive models. Four base models using random forest, CART ensemble and bagging, and their rotation variants, were built and evaluated. The heterogeneity of these base models was achieved by introducing five types of diversity, including a new simplified selective ensemble algorithm. The predictions from the four base models were then used as predictors in multivariate adaptive regression splines (MARS) models. All models were statistically tested using out-of-bag estimates or 5-fold and 10-fold cross-validation. In addition, a variable importance analysis was conducted. The proposed framework was used for short-term forecasting of out-of-sample data for seven days. The stacked models outperformed all single base models, achieving an index of agreement IA = 0.986 and a coefficient of determination of about 95%.
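The two reported scores can be illustrated with a minimal sketch. It assumes that IA refers to Willmott's index of agreement and that the coefficient of determination is computed as 1 minus the ratio of residual to total sum of squares; the abstract does not state either definition, so both are assumptions. The function names and the sample PM10 values are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def _check(observed: Sequence[float], predicted: Sequence[float]) -> None:
    # Both series must describe the same days.
    if len(observed) == 0 or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")


def index_of_agreement(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Willmott's index of agreement (assumed definition of IA)."""
    _check(observed, predicted)
    mean_obs = sum(observed) / len(observed)
    # Sum of squared prediction errors.
    sse = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    # "Potential error": distances of both series from the observed mean.
    potential = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2
        for o, p in zip(observed, predicted)
    )
    if potential == 0:
        raise ValueError("index of agreement is undefined for constant, identical series")
    return 1.0 - sse / potential


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination as 1 - SS_res / SS_tot (assumed definition)."""
    _check(observed, predicted)
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the observations are constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical daily mean PM10 values, for illustration only.
observed_pm10 = [20.0, 35.0, 50.0, 42.0, 28.0]
predicted_pm10 = [22.0, 33.0, 47.0, 45.0, 27.0]
print(index_of_agreement(observed_pm10, predicted_pm10))
print(r_squared(observed_pm10, predicted_pm10))
```

Under these definitions a perfect forecast scores 1 on both measures, while a forecast that always returns the observed mean scores 0 on both.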
