4.6 Article

Auxiliary Diagnosis of Breast Cancer Based on Machine Learning and Hybrid Strategy

Journal

IEEE ACCESS
Volume 11, Issue -, Pages 96374-96386

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2023.3312305

Keywords

Breast cancer; Sampling; Predictive models; Machine learning; Feature extraction; Data models; Classification algorithms; Clinical diagnosis; machine learning; sample balancing; feature selection; classification forecast


This study focuses on breast cancer and proposes a hybrid strategy combined with machine learning methods to build an accurate and efficient breast cancer auxiliary diagnosis model. Experimental results show that the new approach achieves better prediction results compared to previous methods.
Breast cancer has overtaken lung cancer as the most common cancer among women worldwide. This paper takes breast cancer as its research object and proposes a hybrid data-processing strategy that, combined with machine learning methods, yields a more accurate and efficient auxiliary diagnosis model. First, the combined sampling method SMOTE-ENN is used to address sample imbalance, and the data are standardized to improve their separability. Next, the features of the dataset are initially screened using the mutual information method, and a second round of feature selection is performed using recursive feature elimination based on the XGBoost algorithm. This reduces the feature dimensionality of the dataset and improves the generalization ability of the model. Finally, five different machine learning models are used for classification; the best combination of parameters for each model is found by grid search, and the final results for each model are obtained by 10-fold cross-validation. Experiments on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset show that, after the data are processed with the hybrid strategy, the RF model gives the best prediction results, with 99.52% accuracy, which is better than previous research methods.
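
The first step of the strategy described above (SMOTE-ENN resampling followed by standardization) can be illustrated with a small, self-contained sketch. This is not the authors' code: it is a simplified pure-Python version written for illustration, run on a made-up two-dimensional toy dataset rather than WDBC. The function names (`nearest`, `smote`, `enn`, `standardize`), the choice of k = 3, and the random seeds are all assumptions. In practice this step would more likely be done with a library such as imbalanced-learn.

```python
import math
import random
from collections import Counter


def nearest(idx, points, candidates, k):
    """Return the k candidate indices closest to points[idx] (Euclidean), excluding idx."""
    others = [j for j in candidates if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def smote(X, y, minority, k=3, rng=None):
    """Simplified SMOTE for two classes: add synthetic minority points until classes balance."""
    rng = rng or random.Random(0)
    X, y = list(X), list(y)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    deficit = len(y) - 2 * len(minority_idx)  # majority count minus minority count
    for _ in range(deficit):
        i = rng.choice(minority_idx)                     # seed: an original minority point
        j = rng.choice(nearest(i, X, minority_idx, k))   # one of its k minority neighbours
        t = rng.random()                                 # position along the segment i -> j
        X.append(tuple(a + t * (b - a) for a, b in zip(X[i], X[j])))
        y.append(minority)
    return X, y


def enn(X, y, k=3):
    """Edited Nearest Neighbours: drop samples whose label disagrees with most of their k neighbours."""
    everyone = range(len(X))
    keep = []
    for i in everyone:
        votes = Counter(y[j] for j in nearest(i, X, everyone, k))
        if votes.most_common(1)[0][0] == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def standardize(X):
    """Z-score each feature: subtract the column mean, divide by the column standard deviation."""
    cols = list(zip(*X))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [tuple((v - m) / s for v, m, s in zip(row, means, stds)) for row in X]


# Toy imbalanced data (hypothetical): 30 majority points near (0, 0), 8 minority points near (3, 3).
rng = random.Random(42)
majority_pts = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(30)]
minority_pts = [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(8)]
X = majority_pts + minority_pts
y = [0] * 30 + [1] * 8

X_bal, y_bal = smote(X, y, minority=1, k=3, rng=rng)   # oversample the minority class
X_clean, y_clean = enn(X_bal, y_bal, k=3)              # then remove noisy or borderline samples
X_std = standardize(X_clean)                           # finally z-score the features
print(Counter(y), Counter(y_bal), Counter(y_clean))
```

Oversampling first and cleaning second is what makes the method a combined one: SMOTE balances the classes by interpolation, and ENN then removes samples from either class that sit among neighbours of the other class. The later steps named in the abstract (mutual information screening, XGBoost-based recursive feature elimination, grid search, 10-fold cross-validation) are not shown here.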

