4.4 Article

Application of Gaussian Mixture Regression for the Correction of Low Cost PM2.5 Monitoring Data in Accra, Ghana

Journal

ACS EARTH AND SPACE CHEMISTRY
Volume 5, Issue 9, Pages 2268-2279

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acsearthspacechem.1c00217

Keywords

low-cost sensors; particulate matter; air quality; Africa; Gaussian mixture regression

Funding

  1. NSF OISE [2020677]
  2. Columbia Center for Climate and Life
  3. NASA Postdoctoral Program at the Goddard Space Flight Center
  4. Office Of The Director
  5. Office of International Science & Engineering [2020677], funding source: National Science Foundation

Low-cost sensors have the potential to improve air quality data coverage, but require collocation and calibration with reference monitors for high-quality data. This study introduces Gaussian mixture regression for air quality data calibration, showing improvement over traditional methods by increasing correlation and accuracy. Additionally, this method allows for estimation of calibration certainty and demonstrates consistent clustering with climate characteristics for learning underlying data relationships.
Low-cost sensors (LCSs) for air quality monitoring have enormous potential to improve air quality data coverage in resource-limited parts of the world such as sub-Saharan Africa. LCSs, however, are affected by environment and source conditions. To establish high-quality data, LCSs must be collocated and calibrated with reference-grade PM2.5 monitors. From March 2020, a low-cost PurpleAir PM2.5 monitor was collocated with a Met One Beta Attenuation Monitor 1020 in Accra, Ghana. While previous studies have shown that multiple linear regression (MLR) and random forest regression (RF) can improve accuracy and correlation between PurpleAir and reference data, MLR and RF yielded suboptimal improvement in the Accra collocation (R² = 0.81 and R² = 0.81, respectively). We present the first application of Gaussian mixture regression (GMR) to air quality data calibration and demonstrate improvement over traditional methods by increasing the collocated PM2.5 correlation and accuracy to R² = 0.88 and MAE = 2.2 µg m⁻³. A Gaussian mixture model (GMM) is a probability density estimator and clustering method from which nonlinear regressions that tolerate missing inputs can be derived. We find that even when given missing inputs, GMR provides better correlation than MLR and RF performed with complete data. GMR also allows us to estimate calibration certainty. When evaluated, 95% confidence intervals agreed with reference PM2.5 data 96% of the time, suggesting that the model accurately assesses its own confidence. Additionally, clustering within the GMM is consistent with climate characteristics, providing confidence that the calibration approach can learn underlying relationships in data.
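
To make the GMR idea in the abstract concrete, the following is a minimal sketch of how a regression can be read off an already fitted Gaussian mixture by conditioning each component on the observed inputs. It is not the authors' implementation: the function names, the two-component toy mixture, the choice of variables (sensor PM2.5, relative humidity, reference PM2.5) and all numeric values are assumptions made up for illustration. The 95% interval uses a normal approximation to the conditional mixture, which is also an assumption.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a multivariate normal evaluated at the point x."""
    d = len(mean)
    diff = x - mean
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm

def gmr_predict(weights, means, covs, x_obs, in_idx, out_idx):
    """Conditional mean and variance of one output given the observed inputs.

    in_idx lists only the inputs that were actually observed. A missing
    input is handled by leaving its index out, which marginalises it away.
    """
    in_idx = np.asarray(in_idx)
    resp, cond_means, cond_vars = [], [], []
    for w, mu, S in zip(weights, means, covs):
        mu_x, mu_y = mu[in_idx], mu[out_idx]
        S_xx = S[np.ix_(in_idx, in_idx)]   # input-input covariance block
        S_yx = S[out_idx, in_idx]          # output-input covariance row
        gain = np.linalg.solve(S_xx, S_yx) # S_xx^{-1} S_xy (S_xx is symmetric)
        cond_means.append(mu_y + gain @ (x_obs - mu_x))
        cond_vars.append(S[out_idx, out_idx] - S_yx @ gain)
        # Unnormalised responsibility of this component for the observed inputs
        resp.append(w * gaussian_pdf(x_obs, mu_x, S_xx))
    resp = np.array(resp) / np.sum(resp)
    cond_means, cond_vars = np.array(cond_means), np.array(cond_vars)
    mean = resp @ cond_means
    # Variance of a mixture: E[var + mean^2] - (E[mean])^2
    var = resp @ (cond_vars + cond_means ** 2) - mean ** 2
    return mean, var

# Hypothetical fitted mixture over (sensor PM2.5, relative humidity, reference PM2.5).
weights = np.array([0.6, 0.4])
means = [np.array([20.0, 80.0, 15.0]), np.array([60.0, 40.0, 50.0])]
covs = [
    np.array([[25.0, 10.0, 15.0], [10.0, 100.0, 0.0], [15.0, 0.0, 12.0]]),
    np.array([[100.0, -20.0, 80.0], [-20.0, 64.0, -10.0], [80.0, -10.0, 81.0]]),
]

# Prediction with both inputs observed.
full_mean, full_var = gmr_predict(weights, means, covs,
                                  np.array([35.0, 60.0]), [0, 1], 2)
# Prediction with relative humidity missing: only the sensor reading is used.
part_mean, part_var = gmr_predict(weights, means, covs,
                                  np.array([35.0]), [0], 2)

for label, m, v in [("full", full_mean, full_var), ("RH missing", part_mean, part_var)]:
    half_width = 1.96 * np.sqrt(v)  # normal approximation to a 95% interval
    print(f"{label}: {m:.1f} +/- {half_width:.1f}")
```

Because every component stores a full joint covariance, the same fitted mixture serves any subset of observed inputs without refitting, and the conditional variance gives a per-prediction uncertainty of the kind the abstract evaluates against reference data.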

