Discovering relationships and forecasting PM10 and PM2.5 concentrations in Bogota, Colombia, using Artificial Neural Networks, Principal Component Analysis, and k-means clustering

ATMOSPHERIC POLLUTION RESEARCH (2018)

Journal

ATMOSPHERIC POLLUTION RESEARCH

Volume 9, Issue 5, Pages 912-922

Publisher

TURKISH NATL COMMITTEE AIR POLLUTION RES & CONTROL-TUNCAP

DOI: 10.1016/j.apr.2018.02.006

Keywords

Air quality modelling; k-means clustering; Particulate matter; Data mining; Tropical cities

Funding

Universidad de La Sabana [ING-192]

Abstract

Air pollution is a major concern for local authorities in Bogota (Colombia), where PM10 and PM2.5 are the most serious air pollutants. In the present study, data mining algorithms were used to identify the meteorological variables that most influence air pollution in Bogota, and to develop models that forecast PM10 and PM2.5 so that local authorities can help prevent human exposure to high levels of pollution. Data were collected between 2010 and 2015 from 13 stations in the local monitoring network. A data quality analysis was carried out to determine the most and least polluted stations; Kennedy (most polluted) and Parque Simon Bolivar (least polluted) were selected for developing the forecasting models. Principal Component Analysis (PCA) was used to determine the variables that most influenced the behaviour of the data. Models were then developed to forecast next-day average PM10 and PM2.5 concentrations using Artificial Neural Networks (ANN) and k-means clustering. The input variables for the ANN models were selected based on the PCA results, and k-means clustering was applied to group the data, with the resulting clusters used as additional inputs to the forecasting models. Multi-Layer Perceptron models that took the k-means clustering results into account were able to forecast average PM10 and PM2.5 concentrations for the next 24 h. Including the clustering results as input variables improved the PM10 and PM2.5 forecasting models for the most polluted station. Finally, hourly forecast models for PM10 and PM2.5 were developed and evaluated at Kennedy station. Because they accurately predict high-pollution incidents, the developed models can serve as references for issuing early warnings of high air pollution.
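
The idea of feeding k-means clustering results into a forecasting model can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `kmeans` and `add_cluster_features`, the two example variables (wind speed and temperature), the toy values, the choice of k = 2, the deterministic initialisation, and the one-hot encoding of the cluster label are all assumptions made for illustration. The abstract does not say how the clusters were encoded or which variables were clustered.

```python
import math


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means on a list of numeric tuples.

    Assumption: centroids start at evenly spaced rows of the input,
    a simplistic deterministic choice used only to keep the sketch
    reproducible. Returns (centroids, labels).
    """
    centroids = [list(points[i * len(points) // k]) for i in range(k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return centroids, labels


def add_cluster_features(rows, labels, k):
    """Append a one-hot cluster indicator to every input row."""
    return [
        list(row) + [1.0 if lab == c else 0.0 for c in range(k)]
        for row, lab in zip(rows, labels)
    ]


# Made-up daily records: (wind speed in m/s, temperature in deg C).
days = [
    (1.0, 14.0), (1.2, 13.5), (0.8, 14.5),
    (4.0, 18.0), (4.2, 17.5), (3.8, 18.5),
]
centroids, labels = kmeans(days, k=2)
augmented = add_cluster_features(days, labels, k=2)
```

Each row of `augmented` holds the original variables followed by the cluster indicator, and would be passed to a regression model such as a Multi-Layer Perceptron in place of the raw row. With real measurements the variables would normally be standardised before clustering, since k-means relies on Euclidean distance.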
