4.7 Article

Evaluation of gap-filling approaches in satellite-based daily PM2.5 prediction models

Journal

ATMOSPHERIC ENVIRONMENT
Volume 244

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.atmosenv.2020.117921

Keywords

PM2.5; Satellite data; Gap-filling approaches; Random forest; CMAQ

Funding

  1. National Natural Science Foundation of China [41921005, 41625020]
  2. Chinese Postdoctoral Science Foundation [2019M660674]

This study reviewed and compared four gap-filling strategies for high-resolution PM2.5 predictions and found that regression-based methods were more robust, while filling missing predictions with a decision tree was more time-efficient. Additionally, chemical transport model (CTM) simulations improved the accuracy of predicted PM2.5 spatial distributions in all models.
Approximately half of satellite aerosol retrievals are missing, which limits the application of satellite data in PM2.5 pollution monitoring. To obtain spatiotemporally continuous PM2.5 distributions, various gap-filling methods have been developed, but they have rarely been evaluated. Here, we reviewed and summarized four types of gap-filling strategies and applied them to a random forest PM2.5 prediction model that incorporated ground observations, chemical transport model (CTM) simulations, and satellite aerosol optical depth (AOD) to predict daily PM2.5 concentrations at 1-km resolution in 2013 in the Beijing-Tianjin-Hebei region and the Yangtze River Delta. The model out-of-bag predictions were compared with national station measurements and external measurements to assess the performance of the different gap-filling methods. We also conducted a by-city cross-validation and characterized the spatial distributions of PM2.5 predictions when AOD coverage was low. We found that the methods that fill in missing data by regression, i.e., multiple imputation and decision tree, performed robustly in characterizing PM2.5 variation at high spatial resolution, and that the method filling in missing PM2.5 predictions with a decision tree overcame the problem of time-consuming computations. The methods that use spatiotemporal trends to fill in missing data, i.e., ordinary kriging and the generalized additive mixed model, may be overrated in statistical evaluation tests and predicted artificially oversmoothed PM2.5 spatial distributions. We also found that CTM simulations benefited the prediction of PM2.5 spatial distributions in all models, regardless of gap-filling strategy, yielding higher prediction accuracy in the by-city cross-validation. The PM2.5 prediction was not sensitive to the resolution of the CTM simulations: even 12-km resolution CTM simulations benefited the high-resolution PM2.5 prediction.
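
The by-city cross-validation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that "by-city" means holding out all stations of one city at a time, and it swaps the random forest for a one-feature least-squares line so that the example needs only the Python standard library. All function names, the toy AOD values, and the city labels are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def by_city_folds(cities):
    """Yield (held_out_city, train_idx, test_idx), holding out one city at a time."""
    index_by_city = defaultdict(list)
    for i, city in enumerate(cities):
        index_by_city[city].append(i)
    for city in sorted(index_by_city):
        test_idx = index_by_city[city]
        train_idx = [i for c, idx in index_by_city.items() if c != city for i in idx]
        yield city, train_idx, test_idx


def r_squared(observed, predicted):
    """Coefficient of determination between observations and predictions."""
    mean_obs = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fit_line(rows, targets):
    """Placeholder model (assumption): least-squares line on the first feature."""
    x = [row[0] for row in rows]
    mx, my = mean(x), mean(targets)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, targets)) / sxx
    return slope, my - slope * mx


def predict_line(model, rows):
    slope, intercept = model
    return [slope * row[0] + intercept for row in rows]


def by_city_cv(cities, features, pm25, fit, predict):
    """Pool the predictions for every held-out city and score them together."""
    observed, predicted = [], []
    for _, train_idx, test_idx in by_city_folds(cities):
        model = fit([features[i] for i in train_idx], [pm25[i] for i in train_idx])
        predicted.extend(predict(model, [features[i] for i in test_idx]))
        observed.extend(pm25[i] for i in test_idx)
    return r_squared(observed, predicted)


# Toy data (hypothetical): one AOD feature per station-day, PM2.5 exactly linear in AOD.
cities = ["A", "A", "B", "B", "C", "C"]
features = [[0.2], [0.4], [0.6], [0.8], [1.0], [1.2]]
pm25 = [100.0 * row[0] + 10.0 for row in features]

folds = list(by_city_folds(cities))
score = by_city_cv(cities, features, pm25, fit_line, predict_line)
```

Because no station from the held-out city is ever seen during fitting, the score reflects how well a model extrapolates in space. This is presumably why the abstract treats this test as evidence of spatial prediction skill. With real data, the placeholder `fit_line` and `predict_line` would be replaced by a random forest, and a library splitter such as scikit-learn's `LeaveOneGroupOut` could replace `by_city_folds`.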
