Article

Feature extraction and prediction of fine particulate matter (PM2.5) chemical constituents using four machine learning models

EXPERT SYSTEMS WITH APPLICATIONS (2023)

Journal

EXPERT SYSTEMS WITH APPLICATIONS

Volume 221, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD

DOI: 10.1016/j.eswa.2023.119696

Keywords

PM2.5 chemical constituents; Machine learning; Generative adversarial imputation networks; Fully connected deep neural network; Random forest; K-nearest neighbor

Categories

Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic; Operations Research & Management Science

Smart summary

This study used four machine learning models to predict the concentrations of PM2.5 constituents at three sites in South Korea. The GAIN model showed the highest prediction accuracy, and accuracy decreased in all four models as the proportion of missing input data increased.

Abstract

The concentrations of fine particulate matter (PM2.5) constituents, which are essential information for identifying air pollution sources, were predicted at three sites in South Korea (Seoul, Ulsan, and Baengnyeong) between 2016 and 2018 using four machine learning (ML) models: generative adversarial imputation network (GAIN), fully connected deep neural network (FCDNN), random forest (RF), and k-nearest neighbor (kNN). Three PM2.5 constituent groups, namely 8 ions, 2 carbons, and 15 trace elements, were targeted for prediction. The latest hyperparameter optimization techniques were used to learn air pollution characteristics from ambient PM2.5-related information, such as time, meteorology, and air pollutant concentrations. We compared the feature extraction abilities of the four models. The prediction accuracy, measured by the coefficient of determination (R2) between prediction and observation, was highest in GAIN, followed by FCDNN and then RF or kNN. When data on time, air pollutant concentrations, and/or meteorology were available, a simultaneously missing 20 % of the data in all PM2.5 constituent groups was predicted with R2 = 0.897, 0.861, 0.785, and 0.801 by GAIN, FCDNN, RF, and kNN, respectively. As the missing ratio of the input data increased (20 %, 40 %, 60 %, 80 %), prediction accuracy decreased in all four models, most noticeably in GAIN and kNN. As the available period of data increased, prediction accuracy increased in GAIN and FCDNN. Among the target constituent groups, trace elements were predicted with the lowest R2 in all models. Study sites with more emission sources showed lower prediction accuracy, resulting in the highest R2 at Baengnyeong Island and the lowest at Ulsan. According to the current findings, ML models can be used to evaluate various air pollution issues for which data are missing.
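
The evaluation idea described in the abstract (hide a fraction of the data, predict it, and score the prediction against the observation with R2) can be illustrated with a minimal sketch. This is not the authors' pipeline: it uses synthetic stand-in data, a plain k-nearest-neighbor predictor written in standard-library Python, and a simplification in which only the target column is hidden while the features stay complete. The names `r_squared`, `knn_predict`, and `masked_r2` are hypothetical helpers introduced only for this illustration.

```python
import math
import random


def r_squared(observed, predicted):
    """Coefficient of determination between observed and predicted values."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def knn_predict(train_x, train_y, query, k=5):
    """Average the targets of the k training rows closest to `query`."""
    nearest = sorted(
        zip(train_x, train_y),
        key=lambda row: math.dist(row[0], query),
    )[:k]
    return sum(y for _, y in nearest) / len(nearest)


def masked_r2(features, target, missing_ratio, k=5, seed=0):
    """Hide a fraction of target values, predict them with kNN, score with R2."""
    rng = random.Random(seed)
    n = len(target)
    hidden = set(rng.sample(range(n), int(round(missing_ratio * n))))
    # Rows whose target is still visible form the training set.
    train_x = [features[i] for i in range(n) if i not in hidden]
    train_y = [target[i] for i in range(n) if i not in hidden]
    held = sorted(hidden)
    predicted = [knn_predict(train_x, train_y, features[i], k) for i in held]
    observed = [target[i] for i in held]
    return r_squared(observed, predicted)


# Synthetic stand-in data (an assumption, not data from the study):
# two features play the role of meteorology and co-pollutant inputs,
# and the target plays the role of one constituent concentration.
data_rng = random.Random(42)
features = [(data_rng.uniform(0, 1), data_rng.uniform(0, 1)) for _ in range(300)]
target = [3.0 * a + 2.0 * b + data_rng.gauss(0, 0.05) for a, b in features]

# Same missing ratios as reported in the abstract.
for ratio in (0.2, 0.4, 0.6, 0.8):
    score = masked_r2(features, target, ratio)
    print(f"missing ratio {ratio:.0%}: R2 = {score:.3f}")
```

Scoring only the hidden entries keeps the metric focused on values the model never saw, which is the sense in which the abstract reports R2 for missing data. A real study would replace the synthetic arrays with measured inputs and the toy predictor with tuned models.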
