4.4 Review

Hospital Length of Stay Prediction Methods: A Systematic Review

Journal

MEDICAL CARE
Volume 59, Issue 10, Pages 929-938

Publisher

LIPPINCOTT WILLIAMS & WILKINS
DOI: 10.1097/MLR.0000000000001596

Keywords

data analysis; epidemiology; health service research; quality of care; decision-making

Funding

  1. European Research Ambition Pack 2018 grant by the French Auvergne-Rhone-Alpes region

This systematic review established the landscape of length of stay (LOS) prediction methods based on hospital data. Regression was the most popular method, followed by machine learning and deep learning, and the use of machine learning and deep learning methods has increased over the past decade. There is also a trend towards more rigorous validation designs, such as separate test sets and cross-validation. The most commonly used performance metrics are R², mean squared error, and accuracy.
Objective: This systematic review sought to establish a picture of length of stay (LOS) prediction methods based on available hospital data, and of the study protocols designed to measure their performance.

Materials and Methods: An English-language literature search on hospital LOS prediction covering 1972 to September 2019 was conducted according to the PRISMA guidelines. Articles were retrieved from the PubMed, ScienceDirect, and arXiv databases. Information was extracted from the included papers using a standardized assessment of population setting and study sample, data sources and input variables, LOS prediction methods, validation study design, and performance evaluation metrics.

Results: Among the 74 selected articles, 98.6% (73/74) used patient data to predict LOS, 27.0% (20/74) used temporal data, and 21.6% (16/74) used data about hospitals. Overall, regressions were the most popular prediction methods (64.9%, 48/74), followed by machine learning (20.3%, 15/74) and deep learning (17.6%, 13/74). Regarding validation design, 35.1% (26/74) did not use a test set, whereas 47.3% (35/74) used a separate test set and 17.6% (13/74) used cross-validation. The most used performance metrics were R² (47.3%, 35/74), mean squared (or absolute) error (24.4%, 18/74), and accuracy (14.9%, 11/74). Over the last decade, machine learning and deep learning methods became more popular (P=0.016), and test sets and cross-validation were used increasingly often (P=0.014).

Conclusions: Methods to predict LOS are increasingly elaborate, and the assessment of their validity is increasingly rigorous. Reducing heterogeneity in how these methods are used and reported is key to transparency about their performance.
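
To make the validation designs and metrics discussed in the review concrete, the following is a minimal illustrative sketch (not taken from the paper, and not the authors' method): it fits a simple regression on synthetic, placeholder patient-level data and reports the metrics the review found most common (R², mean squared/absolute error) under both a hold-out test set and cross-validation. All variable names and the synthetic data are assumptions for illustration only.

    # Minimal sketch: hold-out and cross-validated evaluation of a LOS regression.
    # Synthetic placeholder data; a real study would use hospital/patient records.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split, cross_val_score
    from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 8))                                    # placeholder patient features
    y = 5 + X @ rng.normal(size=8) + rng.normal(scale=2, size=500)   # placeholder LOS in days

    # Hold-out validation: fit on a training set, report metrics on a separate test set.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
    model = LinearRegression().fit(X_train, y_train)
    y_pred = model.predict(X_test)
    print("R2 :", r2_score(y_test, y_pred))
    print("MSE:", mean_squared_error(y_test, y_pred))
    print("MAE:", mean_absolute_error(y_test, y_pred))

    # Cross-validation: average R2 over k folds instead of relying on a single split.
    cv_r2 = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2")
    print("5-fold CV R2:", cv_r2.mean())

The same evaluation scaffold applies unchanged when the regression is swapped for a machine learning or deep learning model, which is why comparable validation designs and metrics matter for comparing methods across studies.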
