Article

Using methods from the data-mining and machine-learning literature for disease classification and prediction: a case study examining classification of heart failure subtypes

JOURNAL OF CLINICAL EPIDEMIOLOGY (2013)

Journal

JOURNAL OF CLINICAL EPIDEMIOLOGY

Volume 66, Issue 4, Pages 398-407

Publisher

ELSEVIER SCIENCE INC

DOI: 10.1016/j.jclinepi.2012.11.008

Keywords

Boosting; Classification trees; Bagging; Random forests; Classification; Regression trees; Support vector machines; Regression methods; Prediction; Heart failure

Categories

Health Care Sciences & Services; Public, Environmental & Occupational Health

Funding

Institute for Clinical Evaluative Sciences (ICES)
Ontario Ministry of Health and Long-Term Care (MOHLTC)
Canadian Institutes of Health Research (CIHR) [MOP 86508]
Heart and Stroke Foundation
Canada Research Chair in Health Services Research
Career Investigator Award from the Heart and Stroke Foundation
CIHR Team Grant in Cardiovascular Outcomes Research

Abstract

Objective: Physicians classify patients into those with or without a specific disease. Furthermore, there is often interest in classifying patients according to disease etiology or subtype. Classification trees are frequently used to classify patients according to the presence or absence of a disease. However, classification trees can suffer from limited accuracy. In the data-mining and machine-learning literature, alternate classification schemes have been developed. These include bootstrap aggregation (bagging), boosting, random forests, and support vector machines.

Study Design and Setting: We compared the performance of these classification methods with that of conventional classification trees to classify patients with heart failure (HF) according to the following subtypes: HF with preserved ejection fraction (HFPEF) and HF with reduced ejection fraction. We also compared the ability of these methods to predict the probability of the presence of HFPEF with that of conventional logistic regression.

Results: We found that modern, flexible tree-based methods from the data-mining literature offer substantial improvement in prediction and classification of HF subtype compared with conventional classification and regression trees. However, conventional logistic regression had superior performance for predicting the probability of the presence of HFPEF compared with the methods proposed in the data-mining literature.

Conclusion: The use of tree-based methods offers superior performance over conventional classification and regression trees for predicting and classifying HF subtypes in a population-based sample of patients from Ontario, Canada. However, these methods do not offer substantial improvements over logistic regression for predicting the presence of HFPEF.

© 2013 Elsevier Inc. All rights reserved.
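To make the idea of bootstrap aggregation (bagging) named in the abstract concrete, the following is a minimal sketch in plain Python. It is not the authors' code, data, or software: the data are synthetic, the base learner is a one-split tree (a decision stump) rather than a full classification tree, and every name (`fit_stump`, `fit_bagged`, `make_data`) is hypothetical. It only illustrates the mechanism of fitting many trees to bootstrap resamples and averaging their votes into a predicted probability.

```python
import random

def fit_stump(X, y):
    """Fit a one-split tree: pick the feature/threshold with fewest training errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            # Try both labelings of the two sides of the split.
            for left, right in ((0, 1), (1, 0)):
                errors = sum((left if row[j] <= t else right) != label
                             for row, label in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, left, right)
    _, j, t, left, right = best
    return lambda row: left if row[j] <= t else right

def fit_bagged(X, y, n_models, rng):
    """Bagging: fit one stump per bootstrap resample, then average the votes."""
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample n rows with replacement
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    # The fraction of stumps voting for class 1 serves as a predicted probability.
    return lambda row: sum(m(row) for m in models) / len(models)

def make_data(n, rng):
    """Synthetic two-feature data (an assumption, unrelated to the study cohort)."""
    X, y = [], []
    for _ in range(n):
        a, b = rng.gauss(0, 1), rng.gauss(0, 1)
        X.append((a, b))
        y.append(1 if a + b + rng.gauss(0, 0.5) > 0 else 0)
    return X, y

def accuracy(predict, X, y):
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)

rng = random.Random(42)
X_train, y_train = make_data(150, rng)
X_test, y_test = make_data(100, rng)

single = fit_stump(X_train, y_train)
bagged_prob = fit_bagged(X_train, y_train, n_models=25, rng=rng)

acc_single = accuracy(single, X_test, y_test)
acc_bagged = accuracy(lambda row: 1 if bagged_prob(row) >= 0.5 else 0, X_test, y_test)
print(f"single stump accuracy: {acc_single:.2f}")
print(f"bagged stumps accuracy: {acc_bagged:.2f}")
```

On toy data like this the bagged ensemble is not guaranteed to beat the single stump; the sketch shows how the ensemble is built, not the size of any improvement reported in the paper.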
