Journal
ATMOSPHERIC ENVIRONMENT
Volume 244, Issue -, Pages -
Publisher
PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.atmosenv.2020.117971
Keywords
Spatiotemporal; Air pollution; Land use regression model; Long-term exposure; Epidemiology; Support vector regression
The study proposes a hybrid spatiotemporal LUR model system combining support vector regression, multiple linear regression, and a special spatiotemporal algorithm to improve the accuracy of predicting pollutant concentration surfaces and reduce misclassification errors in epidemiological studies.
Air pollution has become a global problem and can cause serious damage to human health. Epidemiological studies of long-term exposure to air pollution can reveal the extent of this damage. Spatiotemporal land use regression (LUR) models can be used to obtain long-term pollutant concentration surfaces with high spatiotemporal resolution. However, previously established spatiotemporal LUR models generally exhibit poor spatial prediction performance in some time panels compared with their average performance. The resulting inaccurate pollutant concentrations lead to misclassification errors in epidemiological studies. To solve this problem, this study proposes a hybrid spatiotemporal LUR model system consisting of support vector regression (SVR), multiple linear regression (MLR), and a special spatiotemporal (ST) algorithm. Three SVR layers were used for the main prediction, whereas MLR and ST were used to supplement time panels with poor spatial prediction performance. In addition, temporal segmentation modeling was adopted for SVR to further improve performance. We used the megacity of Tianjin, China, as our case study, with six target air pollutants (CO, NO2, O3, PM10, PM2.5, and SO2). The performance of the model system was assessed by cross-validation. The results show that the number of days on which the cross-validated R² (R²cv) of the model exceeds 0.6 is 363, 364, 362, 357, 360, and 362 for CO, NO2, O3, PM10, PM2.5, and SO2, respectively, whereas the mean daily R²cv on these days is 0.911, 0.903, 0.891, 0.879, 0.866, and 0.883, respectively. With this model system, relatively high spatial prediction performance was achieved for almost all time panels. The model system can be applied to cohort health studies to obtain highly reliable pollutant concentration surfaces for any time panel and to reduce exposure misclassification errors.
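The evaluation described in the abstract (a daily cross-validated R², a 0.6 threshold, and supplementary models for weak days) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `loocv_r2`, `summarize`, and `pick_model` are hypothetical, leave-one-out is only one possible cross-validation scheme, a one-predictor least-squares fit stands in for the SVR, MLR, and ST models, and the fallback order is an assumption, since the abstract does not say how the system chooses among its components.

```python
from statistics import mean


def loocv_r2(x, y):
    """Leave-one-out cross-validated R^2 for a one-predictor least-squares fit.

    Assumption: a simple linear fit stands in for the real models; x and y
    are one day's predictor values and observed concentrations per site.
    """
    preds = []
    for i in range(len(x)):
        # Hold out site i and fit on the remaining sites.
        xs = x[:i] + x[i + 1:]
        ys = y[:i] + y[i + 1:]
        mx, my = mean(xs), mean(ys)
        sxx = sum((a - mx) ** 2 for a in xs)
        slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sxx
        preds.append(my + slope * (x[i] - mx))
    ss_res = sum((b - p) ** 2 for b, p in zip(y, preds))
    y_bar = mean(y)
    ss_tot = sum((b - y_bar) ** 2 for b in y)
    return 1 - ss_res / ss_tot


def summarize(daily_r2, threshold=0.6):
    """Count the days whose R^2cv exceeds the threshold and average over them."""
    good = [r for r in daily_r2 if r > threshold]
    return len(good), (mean(good) if good else None)


def pick_model(day_scores, priority=("svr", "mlr", "st"), threshold=0.6):
    """Hypothetical fallback rule: take the first model in priority order
    whose R^2cv clears the threshold, else the best-scoring model that day."""
    for name in priority:
        if day_scores.get(name, float("-inf")) > threshold:
            return name
    return max(day_scores, key=day_scores.get)
```

Under these assumptions, `summarize` applied to a year of daily scores would return the two kinds of figures reported per pollutant (the number of days above 0.6 and the mean R²cv on those days), and `pick_model` shows one way a weak primary prediction could be replaced for a given day.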