
Massive external validation of a machine learning algorithm to predict pulmonary embolism in hospitalized patients

THROMBOSIS RESEARCH (2022)

Journal

THROMBOSIS RESEARCH

Volume 216, Pages 14-21

Publisher

PERGAMON-ELSEVIER SCIENCE LTD

DOI: 10.1016/j.thromres.2022.05.016

Keywords

Pulmonary embolism; Thrombosis; Machine learning; Prophylaxis; External validation

Automated Summary

This study conducted a large-scale external validation of a machine learning-based PE prediction model. The results showed that the model performed well across different patient populations, demonstrating its generalizability and potential as a clinical decision support tool to aid PE detection and improve patient outcomes.

Abstract

Background: Pulmonary embolism (PE) is a life-threatening condition associated with ~10% of deaths of hospitalized patients. Machine learning algorithms (MLAs) that predict the onset of PE could enable earlier treatment and improve patient outcomes. However, their clinical utility depends on the extent to which they generalize to broader patient populations.

Objective: To conduct the first large-scale external validation of a machine learning-based PE prediction model that uses electronic health record (EHR) data from the first three hours of a patient's hospital stay to predict the occurrence of PE within the next 10 days of the inpatient stay.

Methods: This retrospective study included approximately two million adult hospital admissions across 44 medical institutions in the US from 2011 to 2017. Demographics, vital signs, and lab tests from adult inpatients at 12 institutions (n = 331,268; 3.3% PE positive) were used to train an XGBoost model. The model was externally validated, without retraining, on the patient populations from each of 32 medical institutions (total n = 1,660,715; 3.7% PE positive). Model performance was assessed using the area under the receiver operating characteristic curve (AUROC). Backward elimination regression was used to identify correlations between characteristics of the external validation sets and AUROC.

Results: The model performed well (AUROC = 0.87) on the 20% hold-out subset of the training set. Despite demographic differences between the 32 external validation populations (percent PE positive: min = 1.54%, max = 6.47%), the model, without retraining, had excellent discrimination, with a mean AUROC of 0.88 (min = 0.79, max = 0.93). With sensitivity fixed at 0.80, the model had a mean specificity of 0.85 (min = 0.64, max = 0.93). Backward elimination regression identified a negative association (beta = -0.015, p < 0.001) between the percentage of PE positive encounters and AUROC.

Conclusions: A PE prediction model performed remarkably well across 32 different external patient populations without retraining and despite significant differences in demographic characteristics, demonstrating its generalizability and potential as a clinical decision support tool to aid PE detection and improve patient outcomes in a clinical setting.
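
The two metrics reported in the abstract, AUROC and specificity at a fixed sensitivity of 0.80, can be illustrated with a minimal sketch. This is not the authors' code: the function names (`auroc`, `specificity_at_sensitivity`, `summarize`) and the toy data are assumptions made for illustration, and the thresholding rule (flag an encounter when its score is at or above the cutoff) is one reasonable convention among several. The pairwise AUROC below is quadratic in the number of encounters, so a rank-based or library implementation would be needed at the scale of the study.

```python
import math
from typing import Dict, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def specificity_at_sensitivity(labels: Sequence[int],
                               scores: Sequence[float],
                               target: float = 0.80) -> float:
    """Specificity at the highest cutoff whose sensitivity is >= target.
    An encounter is flagged when score >= cutoff (an assumed convention)."""
    pos = sorted((s for y, s in zip(labels, scores) if y == 1), reverse=True)
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Number of positives that must be flagged to reach the target sensitivity.
    k = max(1, math.ceil(target * len(pos) - 1e-9))
    cutoff = pos[k - 1]
    true_negatives = sum(1 for s in neg if s < cutoff)
    return true_negatives / len(neg)


def summarize(per_site: Dict[str, float]) -> Dict[str, float]:
    """Mean, min, and max of a metric across validation sites."""
    values = list(per_site.values())
    return {"mean": sum(values) / len(values),
            "min": min(values),
            "max": max(values)}


# Toy data for one hypothetical site: 5 PE-positive and 5 PE-negative encounters.
toy_labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.2, 0.65, 0.5, 0.4, 0.3, 0.1]
print(auroc(toy_labels, toy_scores))                       # 0.8
print(specificity_at_sensitivity(toy_labels, toy_scores))  # 0.8
```

In an external validation of this kind, both functions would be applied separately to each site's encounters using the frozen model's scores, and `summarize` would then yield the mean, minimum, and maximum across sites in the form the abstract reports.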
