4.8 Article

Construction of a virtual PM2.5 observation network in China based on high-density surface meteorological observations using the Extreme Gradient Boosting model

Journal

ENVIRONMENT INTERNATIONAL
Volume 141, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.envint.2020.105801

Keywords

Virtual PM2.5 observation network; Surface meteorological observations; Visibility; Extreme Gradient Boosting model

Funding

  1. National Science Fund for Distinguished Young Scholars [41825011]
  2. National Key R&D Program Pilot Projects of China [2016YFA0601901, 2016YFC0203304]
  3. National Natural Science Foundation of China [41590874]


With increasing public concern about air pollution in China, there is a demand for long-term continuous PM2.5 datasets. However, China did not establish a national PM2.5 observation network until the end of 2012. Before that, satellite-retrieved aerosol optical depth (AOD) was frequently used as a primary predictor to estimate surface PM2.5. Nevertheless, satellite-retrieved AOD often has incomplete daily coverage because of its sampling frequency and interference from clouds, which greatly affects the representativeness of AOD-based PM2.5 estimates. Here, we constructed a virtual ground-based PM2.5 observation network at 1180 meteorological sites across China using the Extreme Gradient Boosting (XGBoost) model with high-density meteorological observations as major predictors. Cross-validation of the XGBoost model showed strong robustness and high accuracy in its estimation of daily (monthly) PM2.5 across China in 2018, with R², root-mean-square error (RMSE) and mean absolute error values of 0.79 (0.92), 15.75 µg/m³ (6.75 µg/m³) and 9.89 µg/m³ (4.53 µg/m³), respectively. Meanwhile, we find that surface visibility plays the dominant role in terms of relative variable importance in the XGBoost model, accounting for 39.3% of the overall importance. We then used meteorological and PM2.5 data from 2017 to assess the predictive capability of the model. Results showed that the XGBoost model is capable of accurately hindcasting historical PM2.5 at monthly (R² = 0.80, RMSE = 14.75 µg/m³), seasonal (R² = 0.86, RMSE = 12.28 µg/m³), and annual (R² = 0.81, RMSE = 10.10 µg/m³) mean levels. In general, the newly constructed virtual PM2.5 observation network based on high-density surface meteorological observations using the Extreme Gradient Boosting model shows great potential for reconstructing historical PM2.5 at ~1000 meteorological sites across China. It will help fill gaps in AOD-based PM2.5 data and benefit other environmental studies, including epidemiology.
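
The abstract reports model skill as R², RMSE and mean absolute error. As a reading aid, the minimal Python sketch below shows one common way to compute these three scores from paired observed and estimated PM2.5 values. It is not the authors' code: the function name `regression_metrics` and the sample values are hypothetical, and R² is taken here as the coefficient of determination, which is an assumption, since the abstract does not say whether it means that or a squared correlation coefficient.

```python
import math


def regression_metrics(observed, predicted):
    """Return (R^2, RMSE, MAE) for paired observed/predicted values.

    R^2 is computed as the coefficient of determination,
    1 - SS_res / SS_tot (an assumption; see the note above).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    # Residual and total sums of squares.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(o - p) for o, p in zip(observed, predicted)) / n
    return r2, rmse, mae


# Made-up daily PM2.5 values in µg/m³, for illustration only.
observed = [35.0, 50.0, 20.0, 80.0]
predicted = [30.0, 55.0, 25.0, 70.0]
r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.3f} RMSE={rmse:.2f} MAE={mae:.2f}")
```

RMSE and mean absolute error share the units of the input (µg/m³ here), while R² is dimensionless, which is why the abstract quotes units for only two of the three scores.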
