Article

Prediction model using SMOTE, genetic algorithm and decision tree (PMSGD) for classification of diabetes mellitus

Journal

MULTIMEDIA SYSTEMS
Volume 28, Issue 4, Pages 1289-1307

Publisher

SPRINGER
DOI: 10.1007/s00530-021-00817-2

Keywords

Decision tree; Genetic algorithm; SMOTE; Data classification; Healthcare; Machine learning

The study introduces PMSGD, a prediction model that processes the data in four layers to improve the accuracy of diabetes classification.
Diabetes mellitus is a well-known chronic disease that diminishes the insulin-producing capability of the human body. The resulting high blood sugar level can lead to complications such as eye damage, nerve damage, cardiovascular damage, kidney damage and stroke. Although diabetes has attracted considerable research attention, the performance of machine learning techniques for classifying the disease remains relatively low, mainly because of class imbalance and missing values in the data. In this paper, we propose a novel Prediction Model using Synthetic Minority Oversampling Technique, Genetic Algorithm and Decision Tree (PMSGD) for classification of diabetes mellitus on the Pima Indians Diabetes Database (PIDD). The framework of the proposed PMSGD model is composed of four layers. The first is the pre-processing layer, which handles missing values, detects outliers and oversamples the minority class. In the second layer, the most significant features are selected using correlation and a genetic algorithm. In the third layer, the proposed model is trained, and in the fourth layer its effectiveness is evaluated in terms of classification accuracy (CA), classification error (CE), precision, recall (sensitivity), F-measure (FM) and area under the ROC curve (AUROC). The proposed PMSGD algorithm outperforms its counterparts, achieving an accuracy of 82.1256%. The best results achieved by the proposed system in terms of CA, CE, precision, sensitivity, FM and AUROC are 82.1256%, 17.8744%, 0.8070, 0.8598, 0.8326 and 0.8511, respectively. The simulation results show the effectiveness of the proposed PMSGD model, which reduces the error rate and thereby supports the decision-making process.
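
The reported evaluation figures are related by simple arithmetic, which the following minimal sketch illustrates. It assumes that FM denotes the balanced F-measure (the harmonic mean of precision and recall) and that CE is the complement of CA. The abstract does not state either definition, and the function names are hypothetical, chosen only for this illustration.

```python
# Sketch: relate the metrics quoted in the abstract to one another.
# Assumptions (not stated in the abstract): FM is the balanced F-measure (F1),
# and classification error is 100% minus classification accuracy.


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (balanced F-measure)."""
    return 2 * precision * recall / (precision + recall)


def classification_error(accuracy_pct: float) -> float:
    """Classification error in percent, as the complement of accuracy."""
    return 100.0 - accuracy_pct


# Values quoted in the abstract.
reported_precision = 0.8070
reported_recall = 0.8598
reported_accuracy_pct = 82.1256

print(round(f_measure(reported_precision, reported_recall), 4))
print(round(classification_error(reported_accuracy_pct), 4))
```

Under these assumptions, the computed values round to the reported FM of 0.8326 and CE of 17.8744%, so the figures quoted in the abstract are mutually consistent.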
