Proceedings Paper

Ensemble Gain Ratio Feature Selection (EGFS) Model with Machine Learning and Data Mining Algorithms for Disease Risk Prediction

Publisher

IEEE
DOI: 10.1109/icict48043.2020.9112406

Keywords

Ensemble gain ratio feature selection (EGFS) model; machine learning; data mining; health care; disease risk prediction; thyroid; random forest; KNN; logistic regression; naive Bayes

Machine Learning (ML) and Data Mining (DM) play a vital role in tasks such as disease risk prediction in healthcare, helping practitioners serve their communities better. According to the literature, diagnoses made by medical practitioners still carry a 12% chance of error. To reduce this error rate and improve predictive performance, a novel Ensemble Gain ratio Feature Selection (EGFS) model is introduced to extract the features that contribute most to the prediction. The model is evaluated with accuracy, Area Under Curve (AUC), and other metrics rather than accuracy alone, because accuracy by itself is misleading on an imbalanced dataset and can lead to a wrong diagnosis that seriously harms the patient's health or even costs lives. The experiments use the thyroid disease dataset from the UCI Machine Learning Repository. The EGFS model combines an ensemble algorithm, the random forest, with the gain ratio algorithm to find the most relevant and contributing features, and the selected features are then passed to ML and DM algorithms such as k-nearest neighbors, logistic regression, and naive Bayes. The highest accuracy achieved by the proposed EGFS model is 96.49% and the highest AUC recorded is 99.10%. These results significantly improve disease risk prediction and exceed those of many recent research works, while using only the four most relevant of the twenty-eight features in the dataset, a feature reduction of 85.71%.
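
The gain ratio criterion named in the abstract can be illustrated with a minimal sketch. The abstract does not say how EGFS combines gain ratio with the random forest, so the code below shows only the standard gain ratio (information gain divided by split information) for categorical features, followed by a simple top-k selection. The function names, the `top_k_features` helper, and the toy data are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Gain ratio of one categorical feature with respect to the class labels."""
    total = len(labels)
    # Group the labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Information gain: entropy before the split minus weighted entropy after it.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information penalises features with many distinct values.
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


def top_k_features(columns, labels, k):
    """Hypothetical helper: names of the k features with the highest gain ratio."""
    scores = {name: gain_ratio(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data invented for this sketch (not from the thyroid dataset).
labels = ["sick", "sick", "healthy", "healthy"]
columns = {
    "informative": ["a", "a", "b", "b"],  # perfectly separates the classes
    "noise": ["x", "y", "x", "y"],        # independent of the classes
}
selected = top_k_features(columns, labels, k=1)

# Feature reduction reported in the abstract: 4 of 28 features kept.
reduction_pct = 100 * (28 - 4) / 28
```

On real data, continuous attributes would need to be discretised before this score applies, and the ranking would be computed on training data only so that the reported accuracy and AUC are not inflated by selection on the test set.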
