Machine learning for patient risk stratification: standing on, or looking over, the shoulders of clinicians?

NPJ DIGITAL MEDICINE (2021)

Journal

NPJ DIGITAL MEDICINE

Volume 4, Issue 1

Publisher

NATURE PORTFOLIO

DOI: 10.1038/s41746-021-00426-3

Funding

NLM [T15LM007092]
NICHD [T32HD040128]
NIH NHLBI [7K01HL141771]

Automated Summary

Machine learning models trained on clinician-initiated administrative data perform close to EMR-based benchmarks for inpatient outcomes, but their performance declines when they are evaluated on specific patient populations, such as myocardial infarction patients. The results highlight the importance of physician diagnosis in the prognostic performance of these models and suggest that models with similar performance may derive their signal from observing clinical behavior. For models to guide clinicians in individual decisions, performance exceeding these benchmarks is necessary.

Abstract

Machine learning can help clinicians to make individualized patient predictions only if researchers demonstrate models that contribute novel insights, rather than learning the most likely next step in a set of actions a clinician will take. We trained deep learning models using only clinician-initiated, administrative data for 42.9 million admissions using three subsets of data: demographic data only, demographic data and information available at admission, and the previous data plus charges recorded during the first day of admission. Models trained on charges during the first day of admission achieve performance close to published full EMR-based benchmarks for inpatient outcomes: in-hospital mortality (0.89 AUC), prolonged length of stay (0.82 AUC), and 30-day readmission rate (0.71 AUC). Similar performance between models trained with only clinician-initiated data and those trained with full EMR data purporting to include information about patient state and physiology should raise concern in the deployment of these models. Furthermore, these models exhibited significant declines in performance when evaluated over only myocardial infarction (MI) patients relative to models trained over MI patients alone, highlighting the importance of physician diagnosis in the prognostic performance of these models. These results provide a benchmark for predictive accuracy trained only on prior clinical actions and indicate that models with similar performance may derive their signal by looking over clinicians' shoulders, using clinical behavior as the expression of preexisting intuition and suspicion to generate a prediction. For models to guide clinicians in individual decisions, performance exceeding these benchmarks is necessary.
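The AUC figures quoted in the abstract can be read as the probability that a model ranks a randomly chosen positive case (for example, an admission ending in in-hospital death) above a randomly chosen negative one. The following is a minimal sketch of that comparison across three nested feature sets. It is not the authors' code: the function `roc_auc`, the labels, and the risk scores are all hypothetical toy values chosen only to illustrate the calculation.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (the Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy outcome: 1 = in-hospital death, 0 = survived.
labels = [0, 0, 1, 0, 1, 1]

# Hypothetical risk scores from models given progressively more
# clinician-initiated data (invented numbers, not results from the paper).
scores_by_feature_set = {
    "demographics": [0.2, 0.6, 0.5, 0.4, 0.3, 0.7],
    "demographics + admission": [0.2, 0.5, 0.6, 0.3, 0.4, 0.8],
    "demographics + admission + day-1 charges": [0.1, 0.4, 0.7, 0.2, 0.6, 0.9],
}

auc_by_feature_set = {
    name: roc_auc(labels, scores)
    for name, scores in scores_by_feature_set.items()
}

for name, auc in auc_by_feature_set.items():
    print(f"{name}: AUC = {auc:.2f}")
```

The pairwise loop is quadratic in the number of cases, which is fine for a toy example; at the scale of millions of admissions, a rank-based implementation from an established library would be the practical choice.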
