4.7 Article

Prediction of hospitalization due to heart diseases by supervised learning methods

Journal

INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS
Volume 84, Issue 3, Pages 189-197

Publisher

ELSEVIER IRELAND LTD
DOI: 10.1016/j.ijmedinf.2014.10.002

Keywords

Prevention; Predictive models; Hospitalization; Heart diseases; Machine learning; Electronic Health Records (EHRs)

Funding

  1. NSF [IIS-1237022, CNS-1239021]
  2. NIH/NIGMS [GM093147]
  3. ARO [W911NF-11-1-0227, W911NF-12-1-0390]
  4. ONR [N0001410-1-0952]
  5. NSF Directorate for Computer & Information Science & Engineering, Division of Information & Intelligent Systems [1237022]
  6. NSF Directorate for Engineering, Division of Electrical, Communications & Cyber Systems [1239021]

Background: In 2008, the United States spent $2.2 trillion for healthcare, which was 15.5% of its GDP. 31% of this expenditure is attributed to hospital care. Evidently, even modest reductions in hospital care costs matter. A 2009 study showed that nearly $30.8 billion in hospital care cost during 2006 was potentially preventable, with heart diseases being responsible for about 31% of that amount.

Methods: Our goal is to accurately and efficiently predict heart-related hospitalizations based on the available patient-specific medical history. To the best of our knowledge, the approaches we introduce are novel for this problem. The prediction of hospitalization is formulated as a supervised classification problem. We use de-identified Electronic Health Record (EHR) data from a large urban hospital in Boston to identify patients with heart diseases. Patients are labeled and randomly partitioned into a training and a test set. We apply five machine learning algorithms, namely Support Vector Machines (SVM), AdaBoost using trees as the weak learner, logistic regression, a naive Bayes event classifier, and a variation of a Likelihood Ratio Test adapted to the specific problem. Each model is trained on the training set and then tested on the test set.

Results: All five models show consistent results, which could, to some extent, indicate the limit of the achievable prediction accuracy. Our results show that with under 30% false alarm rate, the detection rate could be as high as 82%. These accuracy rates translate to a considerable amount of potential savings, if used in practice.

(C) 2014 Elsevier Ireland Ltd. All rights reserved.
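The results are stated as a detection rate at a bounded false alarm rate. The sketch below shows one way such an operating point might be read off a classifier's scores on a test set. It is an illustration only, not the authors' procedure: the function names, the toy scores and labels, and the rule of sweeping thresholds over the observed scores are all assumptions made for this example.

```python
from typing import List, Optional, Tuple


def rates_at_threshold(
    scores: List[float], labels: List[int], threshold: float
) -> Tuple[float, float]:
    """Return (detection_rate, false_alarm_rate) when a patient is
    flagged as 'will be hospitalized' if score >= threshold.

    labels: 1 = hospitalized, 0 = not hospitalized.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one example of each class")
    true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    false_pos = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return true_pos / positives, false_pos / negatives


def best_operating_point(
    scores: List[float], labels: List[int], max_false_alarm: float = 0.30
) -> Optional[Tuple[float, float, float]]:
    """Sweep thresholds over the observed scores and return
    (threshold, detection_rate, false_alarm_rate) with the highest
    detection rate whose false alarm rate stays within the bound.
    Returns None if no threshold satisfies the bound.
    """
    best = None
    for t in sorted(set(scores)):
        detection, false_alarm = rates_at_threshold(scores, labels, t)
        if false_alarm <= max_false_alarm and (best is None or detection > best[1]):
            best = (t, detection, false_alarm)
    return best


# Hypothetical test-set scores (higher = more likely hospitalized) and labels.
scores = [0.95, 0.80, 0.70, 0.65, 0.40, 0.35, 0.30, 0.20, 0.15, 0.05]
labels = [1, 1, 0, 1, 1, 0, 0, 0, 0, 0]

best = best_operating_point(scores, labels, max_false_alarm=0.30)
print(best)
```

Any of the five classifiers named in the Methods could supply the scores; only the ranking of patients matters for this sweep, so the same routine applies whether the score is an SVM margin, a boosted-tree vote, or a likelihood ratio.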
