4.6 Article

Evaluation of Machine Learning Models for Estimating PM2.5 Concentrations across Malaysia

Journal

APPLIED SCIENCES-BASEL
Volume 11, Issue 16

Publisher

MDPI
DOI: 10.3390/app11167326

Keywords

PM2.5; Himawari-8; random forest; support vector regression; air pollution; Malaysia

Funding

  1. Ministry of Education, Malaysia [FRGS/1/2019/WAB05/UTM/02/3]
  2. WNI WXBUNKA Foundation, Japan [R.J130000.7352.4B406]

The study uses machine-learning models to estimate PM2.5 concentrations across Malaysia and finds higher levels at urban/industrial sites than at suburban/rural sites. PM2.5 concentrations also varied seasonally, peaking during the dry season. The Random Forest model performed slightly better than Support Vector Regression in most of the model configurations.
Southeast Asia (SEA) is a hotspot region for atmospheric pollution and haze conditions, due to extensive forest, agricultural and peat fires. This study aims to estimate the PM2.5 concentrations across Malaysia using two machine-learning (ML) models, Random Forest (RF) and Support Vector Regression (SVR), based on satellite AOD (aerosol optical depth) observations, ground-measured air pollutants (NO2, SO2, CO, O3) and meteorological parameters (air temperature, relative humidity, wind speed and direction). The estimated PM2.5 concentrations for a two-year period (2018-2019) are evaluated against measurements performed at 65 air-quality monitoring stations located at urban, industrial, suburban and rural sites. PM2.5 concentrations varied widely between the stations, with higher values (mean of 24.2 ± 21.6 μg m⁻³) at urban/industrial stations and lower values (mean of 21.3 ± 18.4 μg m⁻³) at suburban/rural sites. Furthermore, pronounced seasonal variability in PM2.5 is recorded across Malaysia, with the highest concentrations during the dry season (June-September). Seven models were developed for PM2.5 predictions, i.e., separately for urban/industrial and suburban/rural sites, for the four dominant seasons (dry, wet and two inter-monsoon), and an overall model, with accuracies in the range R² = 0.46-0.76. The validation analysis reveals that the RF model (R² = 0.53-0.76) exhibits slightly better performance than SVR, except for the overall model. This is the first study conducted in Malaysia for PM2.5 estimations at a national scale combining satellite aerosol retrievals with ground-based pollutants, meteorological factors and ML techniques. The satisfactory prediction of PM2.5 concentrations across Malaysia allows continuous monitoring of pollution levels in remote areas that lack measurement networks.
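
The abstract reports model accuracy as R² without defining it. As a minimal sketch, assuming R² here means the coefficient of determination (1 minus the ratio of residual to total sum of squares), it could be computed as below; the paper may instead use the squared Pearson correlation. The function name `r_squared` and the sample values are hypothetical and are not taken from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared gaps between measurements and estimates.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the measurements around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical PM2.5 values in ug/m^3 (illustrative only, not data from the study).
observed = [12.0, 20.0, 35.0, 18.0, 50.0]
predicted = [15.0, 18.0, 30.0, 22.0, 44.0]
score = r_squared(observed, predicted)
```

Under this definition, a value of 0.76 would mean the model's estimates account for roughly three quarters of the variance in the station measurements.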
