Predicting in-hospital length of stay: a two-stage modeling approach to account for highly skewed data

BMC MEDICAL INFORMATICS AND DECISION MAKING (2022)

Journal

BMC MEDICAL INFORMATICS AND DECISION MAKING

Volume 22, Issue 1

Publisher

BMC

DOI: 10.1186/s12911-022-01855-0

Keywords

Electronic health records; Machine learning; Clinical decision support; Surgical outcomes

Automated Summary

The study shows that short lengths of stay can be predicted accurately, whereas predicting longer lengths of stay is extremely challenging. The authors therefore adopt a two-stage model that first classifies patients as having a long or a short length of stay and then fits a regressor among those predicted to have a short length of stay.

Abstract

Background

In the early stages of the COVID-19 pandemic our institution was interested in forecasting how long surgical patients receiving elective procedures would spend in the hospital. Initial examination of our models indicated that, due to the skewed nature of the length of stay, accurate prediction was challenging, and we instead opted for a simpler classification model. In this work we perform a deeper examination of predicting in-hospital length of stay.

Methods

We used electronic health record data on length of stay from 42,209 elective surgeries. We compare different loss functions (mean squared error, mean absolute error, mean relative error), algorithms (LASSO, Random Forests, multilayer perceptron) and data transformations (log and truncation). We also assess the performance of a two-stage hybrid classification-regression approach.

Results

Our results show that while it is possible to accurately predict short lengths of stay, predicting longer lengths of stay is extremely challenging. As such, we opt for a two-stage model: the first stage classifies patients into long versus short lengths of stay, and the second stage fits a regressor among those predicted to have a short length of stay.

Discussion

The results indicate both the challenges and the considerations involved in applying machine-learning methods to skewed outcomes.

Conclusions

Two-stage models allow those developing clinical decision support tools to explicitly acknowledge where they can and cannot make accurate predictions.
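The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single hypothetical numeric feature, substitutes a simple threshold classifier and a least-squares line for the algorithms named in the abstract (LASSO, Random Forests, multilayer perceptron), and uses an assumed 7-day cutoff for a "long" stay, since the abstract does not state the cutoff. All function names and the toy data are illustrative.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def fit_stump(x: Sequence[float], is_long: Sequence[bool]) -> float:
    """Stage 1 (stand-in classifier): choose the threshold t such that
    predicting "long" when x >= t gives the most correct labels."""
    best_t, best_correct = x[0], -1
    for t in sorted(set(x)):
        correct = sum((xi >= t) == flag for xi, flag in zip(x, is_long))
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Stage 2 (stand-in regressor): ordinary least squares for y = a + b * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b


def fit_two_stage(
    x: Sequence[float], los: Sequence[float], long_cutoff: float = 7.0
) -> Callable[[float], Optional[float]]:
    """Fit the classifier on all patients, then fit the regressor only on
    patients the classifier predicts to have a short stay."""
    is_long = [days > long_cutoff for days in los]  # long_cutoff is an assumption
    t = fit_stump(x, is_long)
    short: List[Tuple[float, float]] = [(xi, d) for xi, d in zip(x, los) if xi < t]
    if not short:
        raise ValueError("classifier predicted no short stays to train on")
    a, b = fit_line([s[0] for s in short], [s[1] for s in short])

    def predict(xi: float) -> Optional[float]:
        # Predicted long: return no numeric estimate, only the "long" flag.
        if xi >= t:
            return None
        return a + b * xi

    return predict


# Toy data: a hypothetical risk score and length of stay in days.
scores = [1, 2, 3, 4, 5, 6, 7, 8]
stays = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 15.0, 30.0]
predict = fit_two_stage(scores, stays)
print(predict(4))  # numeric estimate for a patient predicted to stay briefly
print(predict(9))  # None: flagged as a long stay, no point estimate given
```

Returning `None` for patients flagged as long-stay mirrors the point made in the Conclusions: the tool states explicitly where it does not attempt a numeric prediction rather than reporting an unreliable estimate.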
