Article

A tutorial on variable selection for clinical prediction models: feature selection methods in data mining could improve the results

JOURNAL OF CLINICAL EPIDEMIOLOGY (2016)

Journal

JOURNAL OF CLINICAL EPIDEMIOLOGY

Volume 71, Issue -, Pages 76-85

Publisher

ELSEVIER SCIENCE INC

DOI: 10.1016/j.jclinepi.2015.10.002

Keywords

Data mining; Variable selection; Feature selection; Methods; Prediction; Statistical model

Categories

Health Care Sciences & Services; Public, Environmental & Occupational Health

Abstract

Objectives: Identifying an appropriate set of predictors for the outcome of interest is a major challenge in clinical prediction research. The aim of this study was to show the application of some variable selection methods, usually used in data mining, for an epidemiological study. We introduce here a systematic approach.

Study Design and Setting: The P-value-based method, usually used in epidemiological studies, and several filter and wrapper methods were implemented to select the predictors of diabetes among 55 variables in 803 prediabetic females, aged >= 20 years, followed for 10-12 years. To develop a logistic model, variables were selected from a train data set and evaluated on the test data set. The measures of Akaike information criterion (AIC) and area under the curve (AUC) were used as performance criteria. We also implemented a full model with all 55 variables.

Results: We found that the worst and the best models were the full model and models based on the wrappers, respectively. Among filter methods, symmetrical uncertainty gave both the best AUC and AIC.

Conclusion: Our experiment showed that the variable selection methods used in data mining could improve the performance of clinical prediction models.
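
To make the filter approach and the two performance criteria mentioned in the abstract more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation and uses none of the study data: the toy outcome vector and the predictor names (`high_fasting_glucose`, `family_history`) are hypothetical, and all predictors are assumed to be discrete already (continuous variables would need to be discretized first). It relies on the standard definitions: symmetrical uncertainty SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), AUC as the probability that a random case is scored above a random non-case, and AIC = 2k - 2 ln L.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); 0 = independent, 1 = fully dependent."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant, so nothing can be learned
    joint_entropy = entropy(list(zip(x, y)))
    mutual_information = hx + hy - joint_entropy
    return 2.0 * mutual_information / (hx + hy)


def auc(scores, labels):
    """AUC by pairwise comparison of cases (1) and non-cases (0); ties count half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical toy data: 1 = developed diabetes, 0 = did not.
outcome = [1, 1, 1, 1, 0, 0, 0, 0]
predictors = {
    "high_fasting_glucose": [1, 1, 1, 0, 0, 0, 0, 0],  # related to the outcome
    "family_history": [1, 0, 1, 0, 1, 0, 1, 0],        # unrelated in this toy sample
}

# Filter step: score each predictor against the outcome, then rank.
ranked = sorted(
    ((name, symmetrical_uncertainty(x, outcome)) for name, x in predictors.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, su in ranked:
    print(f"{name}: SU = {su:.3f}")
```

In a filter workflow of the kind the abstract describes, the top-ranked predictors from the training set would then be entered into a logistic model, and `auc` and `aic` would be used to compare that model with alternatives on the test set. A wrapper method differs in that it scores candidate subsets by fitting the model itself rather than by a per-variable statistic.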
