Article

An interpretable machine learning approach for predicting 30-day readmission after stroke

Journal

Publisher

ELSEVIER IRELAND LTD
DOI: 10.1016/j.ijmedinf.2023.105050

Keywords

Stroke; Readmission; Machine learning; SHAP

This study developed an interpretable machine learning model to predict 30-day readmission after stroke, using 74 features extracted from electronic health records. SHAP analysis identified severe carotid artery stenosis, weakness, homocysteine, glycosylated hemoglobin, sex, lymphocyte percentage, neutrophilic granulocyte percentage, urine glucose, fresh cerebral infarction, and red blood cell count as the top 10 risk factors. The model reached an AUROC of 0.80 on held-out patients, and the identified risk factors offer insights for treatment.
Background: Stroke is the second leading cause of death worldwide and has a high recurrence rate. We aimed to identify risk factors for stroke recurrence and to develop an interpretable machine learning model to predict 30-day readmissions after stroke.
Methods: Stroke patients recorded in the electronic health records (EHRs) of Xuzhou Medical University Hospital between February 1, 2021, and November 30, 2021, were included in the study; deceased patients were excluded. We extracted 74 features from the EHRs, and the top 20 features, ranked by chi-squared value, were used to build machine learning models. The models were trained on 80% of the patients, and the remaining 20% formed a holdout dataset used for verification. The Shapley Additive exPlanations (SHAP) method was used to explore the interpretability of the model.
Results: The cohort included 6,558 patients with a mean (SD) age of 65 (11) years; 3,926 (59.86%) were male, and 132 (2.01%) were readmitted within 30 days. The area under the receiver operating characteristic curve (AUROC) for the optimized model was 0.80 (95% CI 0.68-0.80). The SHAP method identified the top 10 risk factors: severe carotid artery stenosis, weakness, homocysteine, glycosylated hemoglobin, sex, lymphocyte percentage, neutrophilic granulocyte percentage, urine glucose, fresh cerebral infarction, and red blood cell count. The AUROC of a model built on these 10 features was 0.80 (95% CI 0.69-0.80), which was not significantly different from that of the model with 20 features.
Conclusions: Our methods not only showed good performance in predicting 30-day readmissions after stroke but also revealed risk factors that provide valuable insights for treatment.
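The two quantitative steps named in the Methods and Results, ranking features by chi-squared value and reporting an AUROC with a 95% CI, can be illustrated with a minimal sketch. The abstract does not say which chi-squared variant was used or how the confidence interval was obtained, so the Pearson statistic on a 2x2 table and the percentile bootstrap below are assumptions, not the authors' procedure. The function names (`chi2_binary`, `top_k_features`, `auroc`, `bootstrap_ci`) and the toy data are hypothetical and chosen only for illustration.

```python
import random


def chi2_binary(feature, label):
    """Pearson chi-squared statistic for a binary feature vs. a binary label."""
    n = len(feature)
    obs = [[0, 0], [0, 0]]  # obs[feature value][label value]
    for f, y in zip(feature, label):
        obs[f][y] += 1
    row = [sum(obs[0]), sum(obs[1])]
    col = [obs[0][0] + obs[1][0], obs[0][1] + obs[1][1]]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            if expected > 0:
                stat += (obs[i][j] - expected) ** 2 / expected
    return stat


def top_k_features(columns, label, k):
    """Return the names of the k features with the largest chi-squared value."""
    ranked = sorted(columns, key=lambda name: chi2_binary(columns[name], label),
                    reverse=True)
    return ranked[:k]


def auroc(labels, scores):
    """AUROC as the probability that a positive case outscores a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUROC (assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo_idx = int(n_boot * alpha / 2)
    hi_idx = n_boot - 1 - lo_idx
    return stats[lo_idx], stats[hi_idx]


if __name__ == "__main__":
    # Toy holdout set: 1 = readmitted within 30 days, scores = model output.
    y = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
    s = [0.1, 0.2, 0.15, 0.3, 0.05, 0.6, 0.55, 0.25, 0.7, 0.9]
    print("AUROC:", auroc(y, s))
    print("95% CI:", bootstrap_ci(y, s))
```

With only 2.01% of patients readmitted, a 20% holdout set contains few positive cases, so a resampling-based interval of this kind would be expected to be wide, which is in line with the intervals reported above.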
