
COVID-19 Risk Mapping with Considering Socio-Economic Criteria Using Machine Learning Algorithms

Publisher

MDPI
DOI: 10.3390/ijerph18189657

Keywords

COVID-19 crisis; data-driven algorithms; geographic information system (GIS); spatial modeling; health geography

Funding

  1. MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program [IITP-2021-2016-0-00312]


A COVID-19 risk map of Tehran was prepared using machine learning algorithms. The results indicated that the central and eastern regions of Tehran are at higher risk, and that public transportation stations and pharmacies were the land uses most correlated with the locations of COVID-19 patients. According to the Pearson correlation, pharmacies and banks are the most incompatible in distribution, and the density of these land uses has contributed to the spread of COVID-19 in Tehran.
Reducing population concentration in some urban land uses is one way to prevent and reduce the spread of COVID-19. The objective of this study is therefore to prepare a COVID-19 risk map of Tehran, Iran, using machine learning algorithms and socio-economic land-use criteria. Initially, a spatial database was created using 2282 locations of patients with COVID-19 from 2 February 2020 to 21 March 2020 and eight socio-economic land uses affecting the disease: public transport stations, supermarkets, banks, automated teller machines (ATMs), bakeries, pharmacies, fuel stations, and hospitals. The modeling was performed using three machine learning algorithms: random forest (RF), adaptive neuro-fuzzy inference system (ANFIS), and logistic regression (LR). Feature selection was performed using the OneR method, and the correlation between land uses was obtained using the Pearson coefficient. We used 70% of the COVID-19 patient locations for modeling and 30% for validation. The receiver operating characteristic (ROC) curve and the area under the curve (AUC) showed that the RF algorithm had the highest modeling accuracy (AUC = 0.803), followed by the ANFIS algorithm (0.758) and the LR algorithm (0.747). The results showed that the central and eastern regions of Tehran are at higher risk. Public transportation stations and pharmacies were the most correlated with the location of COVID-19 patients in Tehran, according to the results of the OneR technique and the RF and LR algorithms. The Pearson correlation results showed that pharmacies and banks are the most incompatible in distribution, and that the density of these land uses in Tehran has contributed to the prevalence of COVID-19.
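The evaluation steps named in the abstract (a 70%/30% split, the Pearson coefficient between land uses, and AUC as the accuracy measure) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names, the random seed, and all numbers below are hypothetical toy values chosen only to show the mechanics.

```python
import math
import random


def split_70_30(rows, seed=0):
    """Shuffle rows and return (70% for modeling, 30% for validation)."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a random
    positive sample scores higher than a random negative one (ties = 0.5)."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: density of two land uses in six map cells.
pharmacy_density = [4, 7, 1, 9, 3, 6]
bank_density = [5, 6, 2, 8, 2, 7]
print("Pearson r:", round(pearson(pharmacy_density, bank_density), 3))

# Hypothetical model risk scores; label 1 = patient location, 0 = other location.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print("AUC:", round(auc(labels, scores), 3))

modeling, validation = split_70_30(range(10))
print("split sizes:", len(modeling), len(validation))
```

In practice a library such as scikit-learn would typically supply the split, the classifiers, and the ROC computation; the hand-written versions here only make the definitions explicit.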
