Article

Optimizing Coronary Artery Disease Diagnosis: A Heuristic Approach Using Robust Data Preprocessing and Automated Hyperparameter Tuning of eXtreme Gradient Boosting

Journal

IEEE ACCESS
Volume 11, Pages 112988-113007

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2023.3324037

Keywords

BorutaShap; coronary artery disease; data preprocessing; feature selection; Optuna hyper-parameter; XGB

Coronary Artery Disease (CAD) is a prevalent disorder that calls for low-cost automated technology to support early diagnosis and treatment. Traditional machine learning methods may not suit smaller clinical datasets with categorical features, so complementary steps such as data preprocessing, feature selection, and hyperparameter tuning are needed. The developed BSOXGB model achieves 97.70% accuracy on the Z-Alizadeh Sani dataset, making it a practical candidate for automatically detecting and diagnosing CAD.
Coronary Artery Disease (CAD) is an increasingly prevalent disorder that significantly affects both longevity and quality of life, particularly among people aged 30 to 60. Lifestyle, genetics, nutrition, and stress contribute to rising mortality rates. There is therefore a need for low-cost automated technology that diagnoses CAD early and helps medical practitioners treat chronic illness effectively. However, machine learning methods are typically designed to perform well on large datasets and may not be well suited to smaller clinical datasets that contain categorical features. Achieving good performance on such datasets requires complementary steps such as Data Preprocessing (DP), feature selection, and Hyperparameter tuning (HP). Data preprocessing in particular improves accuracy by eliminating noise, handling missing values, and dealing with outliers. To address the challenges of feature selection and of manual hyperparameter selection and optimization, we have developed a novel model called BSOXGB (BorutaShap feature selection based Optuna hyperparameter tuning of eXtreme Gradient Boosting). The proposed model achieves an accuracy of 97.70%, outperforming other classifiers such as Random Forest (RF), AdaBoost (AB), CatBoost (CB), and ExtraTrees (ET) on the publicly available Z-Alizadeh Sani dataset. BSOXGB attains this highest classification accuracy using only 9 relevant features out of 56, demonstrating its potential as a practical solution for automatically detecting and diagnosing CAD in the real world.
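
The abstract names missing-value handling and outlier treatment as preprocessing steps but does not say which methods were used. The following is only a minimal sketch of one common choice, median imputation followed by interquartile-range (IQR) clipping, applied to a single numeric column. The function names, the `k=1.5` threshold, and the sample values are illustrative assumptions, not the authors' pipeline or data.

```python
from statistics import median, quantiles


def impute_median(values):
    """Replace None entries with the median of the observed values (assumed strategy)."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def clip_outliers_iqr(values, k=1.5):
    """Clip values lying outside [Q1 - k*IQR, Q3 + k*IQR]; k=1.5 is a conventional default."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points of the column
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


def preprocess_column(values, k=1.5):
    """Impute missing entries first, then clip outliers."""
    return clip_outliers_iqr(impute_median(values), k)


# Hypothetical numeric column with one missing entry and one extreme value.
column = [50, 52, None, 55, 58, 60, 300]
print(preprocess_column(column))
```

Clipping rather than dropping outliers keeps every record, which may matter on a small clinical dataset; whether the paper clips, drops, or transforms outliers is not stated in the abstract.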
