4.2 Article

Comparison of machine learning models for predicting the risk of breast cancer-related lymphedema in Chinese women

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.apjon.2022.100101

Keywords

Breast cancer-related lymphedema; Machine learning; Naïve Bayes; Logistic regression; K-nearest neighbors; Support vector machine; Multilayer perceptron; Prediction

Funding

National Natural Science Foundation of China [72004039]

This study aimed to develop and validate classification models using machine learning algorithms to predict breast cancer-related lymphedema (BCRL) in Chinese women. The logistic regression model achieved the best performance, and the most important predictors were the number of positive lymph nodes, BCRL occurring on the same side as the surgery, a history of sentinel lymph node biopsy, a dietary preference for meat and fried food, and an exercise frequency of less than three times per week.
Objective: Predictive models that use machine learning (ML) algorithms to anticipate cancer symptoms could aid clinical decision-making and enhance the quality of cancer care. This study aimed to develop and validate a selection of ML classification models for predicting the occurrence of breast cancer-related lymphedema (BCRL) among Chinese women.

Methods: This was a retrospective cohort study of consecutive cases diagnosed with stage I-IV breast cancer. Forty-eight variables were grouped into five feature sets. Five ML classification models were developed, and the models' performance and the variables' relative importance were assessed.

Results: Of 370 eligible female participants, 91 (24.6%) had BCRL. The mean age of the sample was 49.89 years (SD = 7.45). All participants had undergone breast cancer surgery, and more than half had undergone a modified radical mastectomy (n = 206, 55.5%). The mean follow-up time after breast cancer surgery was 28.73 months (SD = 11.71). Most of the tumors were either stage I (n = 49, 13.2%) or stage II (n = 252, 68.1%). More than half of the sample had received postoperative chemotherapy (n = 227, 61.4%). Overall, the logistic regression model achieved the best performance for BCRL in terms of accuracy (91.6%), precision (82.1%), and recall (91.4%). Although the study included 48 predictor variables, the five models required only 22 of them to achieve their predictive performance. The most important variable was the number of positive lymph nodes, followed in descending order by BCRL occurring on the same side as the surgery, a history of sentinel lymph node biopsy, a dietary preference for meat and fried food, and an exercise frequency of less than three times per week. These factors were the most influential predictors for enhancing the ML models' performance.

Conclusions: In the ML training dataset, the multilayer perceptron and logistic regression models discriminated best in predicting BCRL, while the k-nearest neighbors and support vector machine models demonstrated good calibration in the ML validation dataset. Future research will need large-sample datasets to establish a more robust ML model that predicts BCRL reliably.
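The abstract reports accuracy, precision, and recall for the BCRL class without defining them. The following minimal sketch shows how these three metrics are conventionally computed from binary predictions. The function names (`confusion_counts`, `classification_metrics`) and the ten example labels are hypothetical and chosen only for illustration; they are not the study's data and do not reproduce its reported figures.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count (TP, FP, TN, FN), treating label 1 as the positive (BCRL) class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1  # BCRL case correctly flagged
        elif truth == 0 and pred == 1:
            fp += 1  # non-BCRL case wrongly flagged
        elif truth == 0 and pred == 0:
            tn += 1  # non-BCRL case correctly cleared
        else:
            fn += 1  # BCRL case missed
    return tp, fp, tn, fn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, precision, and recall for the positive class."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of predicted positives that are true positives.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of actual positives that the model found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical labels for ten patients (1 = BCRL, 0 = no BCRL); not study data.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With an imbalanced outcome such as the one described here (about a quarter of participants had BCRL), accuracy alone can look high even when many positive cases are missed, which is one reason precision and recall are usually reported alongside it.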
