Application of machine learning approaches to administrative claims data to predict clinical outcomes in medical and surgical patient populations

PLOS ONE (2021)

Journal

PLOS ONE

Volume 16, Issue 6

Publisher

Public Library of Science

DOI: 10.1371/journal.pone.0252585

Funding

Penn Medicine, University of Pennsylvania Health System

Automated Summary

This study developed and validated a claims-based, machine learning risk model for predicting clinical outcomes in medical and surgical patient populations. The risk model showed high predictive accuracy for 30-day mortality and most adverse events, with moderate accuracy for rehospitalization and some adverse events. The machine learning algorithm performed well on a second independent dataset, confirming its reliability and generalizability.

Abstract

Objective: This study aimed to develop and validate a claims-based, machine learning algorithm to predict clinical outcomes across both medical and surgical patient populations.

Methods: This retrospective, observational cohort study used a random 5% sample of 770,777 fee-for-service Medicare beneficiaries with an inpatient hospitalization between 2009 and 2011. The machine learning algorithms tested were support vector machine, random forest, multilayer perceptron, extreme gradient boosted tree, and logistic regression. The extreme gradient boosted tree algorithm outperformed the alternatives and was used for the final risk model. The primary outcome was 30-day mortality. The secondary outcomes were rehospitalization and any of 23 adverse clinical events occurring within 30 days of the index admission date.

Results: Algorithm performance was evaluated by both the area under the receiver operating characteristic curve (AUROC) and the Brier score. The risk model demonstrated high performance for prediction of 30-day mortality (AUROC = 0.88; Brier score = 0.06) and 17 of the 23 adverse events (AUROC range: 0.80-0.86; Brier score range: 0.01-0.05). The risk model demonstrated moderate performance for prediction of rehospitalization within 30 days (AUROC = 0.73; Brier score = 0.07) and six of the 23 adverse events (AUROC range: 0.74-0.79; Brier score range: 0.01-0.02). The risk model performed comparably on a second, independent validation dataset, confirming that the risk model was not overfit.

Conclusions and relevance: We have developed and validated a robust, claims-based, machine learning risk model that is applicable to both medical and surgical patient populations and demonstrates predictive accuracy comparable to that of existing risk models.
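The abstract reports two metrics, AUROC (discrimination) and the Brier score (overall accuracy of the predicted probabilities), without showing how they are computed. The following is a minimal sketch of both in plain Python, not the authors' code. The function names `brier_score` and `auroc` and the toy outcome and risk values are illustrative assumptions, and the pairwise AUROC loop is chosen for readability rather than speed.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and the 0/1 outcome.

    0 is a perfect score; lower is better.
    """
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a randomly chosen event case receives a higher
    predicted risk than a randomly chosen non-event case (ties count 0.5).

    This pairwise form is O(n_pos * n_neg); it is meant for illustration,
    not for datasets of hundreds of thousands of admissions.
    """
    if len(y_true) != len(y_prob):
        raise ValueError("inputs must be of equal length")
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p_pos in pos:
        for p_neg in neg:
            if p_pos > p_neg:
                wins += 1.0
            elif p_pos == p_neg:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = event within 30 days.
toy_outcomes = [0, 0, 1, 1, 0, 1]
toy_risks = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90]

toy_auroc = auroc(toy_outcomes, toy_risks)
toy_brier = brier_score(toy_outcomes, toy_risks)
```

Because the Brier score depends on how often the outcome occurs, a low value for a rare event (such as the 0.01-0.05 range reported for adverse events) is best read alongside the AUROC rather than on its own.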
