Article

The prediction of asymptomatic carotid atherosclerosis with electronic health records: a comparative study of six machine learning models

Journal

Publisher

BMC
DOI: 10.1186/s12911-021-01480-3

Keywords

Machine learning; Asymptomatic carotid atherosclerosis; Electronic health records; Prediction

Funding

  1. National Natural Science Foundation of China [81070999]
  2. Foundation of Shaanxi social development and technology research project [2016SF-020]
  3. Foundation of Xi'an Science and technology plan project [2019114613YX001SF039(2)]
  4. New medical technology of the Second Affiliated Hospital of Xi'an Jiaotong University [2019-32, 2018-16, 2010-22]
  5. Fundamental Research Funds for the Central Universities (Xi'an Jiaotong University) [xjj2014153, 2009-95]
  6. Foundation of Second Affiliated Hospital of Xi'an Jiaotong University [RC(GG)201109]

A study used machine learning models to predict asymptomatic CAS, finding that the LR model showed the best predictive performance, laying the foundation for establishing an early warning system to allocate CAS prevention measures more accurately.
Background: Carotid B-mode ultrasonography is frequently used to screen for carotid atherosclerosis (CAS). Because CAS progresses without symptoms in most patients, early identification is challenging for clinicians, and undetected CAS may trigger ischemic stroke. Recently, machine learning has shown a strong ability to classify data and potential for prediction in the medical field. Combining machine learning with patients' electronic health records could give clinicians a more convenient and precise way to identify asymptomatic CAS.

Methods: This retrospective cohort study used routine clinical data of medical check-up subjects collected from April 19, 2010 to November 15, 2019. Six machine learning models (logistic regression [LR], random forest [RF], decision tree [DT], eXtreme Gradient Boosting [XGB], Gaussian Naive Bayes [GNB], and K-Nearest Neighbour [KNN]) were used to predict asymptomatic CAS, and their predictive performance was compared in terms of the area under the receiver operating characteristic curve (AUCROC), accuracy (ACC), and F1 score (F1).

Results: Of the 18,441 subjects, 6,553 were diagnosed with asymptomatic CAS. Compared with DT (AUCROC 0.628, ACC 65.4%, F1 52.5%), the other five models achieved a higher AUCROC, with the gain given in percentage points of AUCROC: KNN by 7.6 (AUCROC 0.704, ACC 68.8%, F1 50.9%), GNB by 12.5 (0.753, 67.0%, 46.8%), XGB by 16.0 (0.788, 73.4%, 55.7%), RF by 16.6 (0.794, 74.5%, 56.8%), and LR by 18.1 (0.809, 74.7%, 59.9%). The best-performing model, LR, correctly identified 1,045 of 1,966 cases (sensitivity 53.2%) and 3,088 of 3,566 non-cases (specificity 86.6%). Tenfold cross-validation further verified the predictive ability of LR.

Conclusions: Among the six machine learning models, LR showed the best performance in predicting asymptomatic CAS. Our findings set the stage for an early automatic alerting system, allowing a more precise allocation of CAS prevention measures to the individuals most likely to benefit.
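
The LR figures reported in the Results can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' code. It assumes that the 1,966 cases and 3,566 non-cases form a single evaluation set (the abstract does not say so explicitly), derives the false negative and false positive counts from the reported fractions, and uses a hypothetical helper name, `binary_metrics`.

```python
# Sketch: recompute the reported LR metrics from the counts in the abstract.
# Assumption: the 1966 cases and 3566 non-cases belong to one evaluation set.

def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Return sensitivity, specificity, accuracy and F1 from confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),                   # true positive rate
        "specificity": tn / (tn + fp),                   # true negative rate
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

# Counts for LR: 1045 of 1966 cases and 3088 of 3566 non-cases were correct.
tp, fn = 1045, 1966 - 1045   # fn is derived, not reported
tn, fp = 3088, 3566 - 3088   # fp is derived, not reported
lr = binary_metrics(tp, fn, tn, fp)

# AUCROC values as reported; gains are percentage points relative to DT.
aucroc = {"DT": 0.628, "KNN": 0.704, "GNB": 0.753,
          "XGB": 0.788, "RF": 0.794, "LR": 0.809}
gains = {name: round((auc - aucroc["DT"]) * 100, 1)
         for name, auc in aucroc.items() if name != "DT"}

if __name__ == "__main__":
    for name, value in lr.items():
        print(f"{name}: {value * 100:.1f}%")
    print(gains)
```

Under these assumptions the recomputed sensitivity (53.2%), specificity (86.6%), accuracy (74.7%) and F1 (59.9%) agree with the values reported for LR, and the AUCROC gains match the margins quoted against DT.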
