4.7 Article

Improved disease diagnosis system for COVID-19 with data refactoring and handling methods

Journal

FRONTIERS IN PSYCHOLOGY
Volume 13, Issue -, Pages -

Publisher

FRONTIERS MEDIA SA
DOI: 10.3389/fpsyg.2022.951027

Keywords

classification; disease diagnosis; COVID-19; data augmentation; data pre-processing

Ask authors/readers for more resources

This research focuses on providing a better solution for diagnosing COVID-19 and addressing the issue of limited data for disease prediction models. By developing various machine learning models and applying data augmentation techniques, the authors have achieved improved accuracy in diagnosing COVID-19.
The novel coronavirus illness (COVID-19) outbreak, which began in a seafood market in Wuhan, Hubei Province, China, in mid-December 2019, has spread to almost all countries, territories, and places throughout the world. And since the fault in diagnosis of a disease causes a psychological impact, this was very much visible in the spread of COVID-19. This research aims to address this issue by providing a better solution for diagnosis of the COVID-19 disease. The paper also addresses a very important issue of having less data for disease prediction models by elaborating on data handling techniques. Thus, special focus has been given on data processing and handling, with an aim to develop an improved machine learning model for diagnosis of COVID-19. Random Forest (RF), Decision tree (DT), K-Nearest Neighbor (KNN), Logistic Regression (LR), Support vector machine, and Deep Neural network (DNN) models are developed using the Hospital Israelita Albert Einstein (in Sao Paulo, Brazil) dataset to diagnose COVID-19. The dataset is pre-processed and distributed DT is applied to rank the features. Data augmentation has been applied to generate datasets for improving classification accuracy. The DNN model dominates overall techniques giving the highest accuracy of 96.99%, recall of 96.98%, and precision of 96.94%, which is better than or comparable to other research work. All the algorithms are implemented in a distributed environment on the Spark platform.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.7
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available