Article

Penalization and shrinkage methods produced unreliable clinical prediction models especially when sample size was small

Journal

JOURNAL OF CLINICAL EPIDEMIOLOGY
Volume 132, Pages 88-96

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.jclinepi.2020.12.005

Keywords

Risk prediction models; Penalization; Shrinkage; Overfitting; Sample size

Funding

  1. Cancer Research UK [C49297/A27294]
  2. NIHR Biomedical Research Centre, Oxford
  3. National Institute for Health Research School for Primary Care Research (NIHR SPCR)
  4. National Institute for Health Research (NIHR)
  5. MRC-NIHR Methodology Research Program [MR/T025085/1]

When developing a clinical prediction model, penalization techniques may be unreliable due to large uncertainty in estimated tuning parameters, especially with small effective sample sizes and low model overfitting levels. This uncertainty can result in considerable miscalibration of model predictions for new individuals.
Objectives: When developing a clinical prediction model, penalization techniques are recommended to address overfitting, as they shrink predictor effect estimates toward the null and reduce mean-square prediction error in new individuals. However, shrinkage and penalty terms ("tuning parameters") are estimated with uncertainty from the development data set. We examined the magnitude of this uncertainty and the subsequent impact on prediction model performance.
Study Design and Setting: This study comprises applied examples and a simulation study of the following methods: uniform shrinkage (estimated via a closed-form solution or bootstrapping), ridge regression, the lasso, and elastic net.
Results: In a particular model development data set, penalization methods can be unreliable because tuning parameters are estimated with large uncertainty. This is of most concern when development data sets have a small effective sample size and the model's Cox-Snell R² is low. The problem can lead to considerable miscalibration of model predictions in new individuals.
Conclusion: Penalization methods are not a "carte blanche"; they do not guarantee a reliable prediction model is developed. They are more unreliable when needed most (i.e., when overfitting may be large). We recommend they are best applied with large effective sample sizes, as identified from recent sample size calculations that aim to minimize the potential for model overfitting and precisely estimate key parameters.
© 2021 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
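
The idea that a tuning parameter is itself an estimate, and so varies from one development data set to the next, can be illustrated with a small simulation. The sketch below is not the authors' simulation design: it assumes a continuous outcome, a linear model without intercept, closed-form ridge regression, and 5-fold cross-validation over a fixed grid of penalties. All names (`ridge_fit`, `cv_select_lambda`, `tuning_spread`) and all settings (10 predictors, true coefficients of 0.2, unit error variance) are hypothetical choices made for illustration only.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge coefficients (no intercept; data assumed centred)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def cv_select_lambda(X, y, lambdas, k, rng):
    """Return the penalty in `lambdas` with the lowest k-fold CV squared error."""
    n = len(y)
    folds = np.array_split(rng.permutation(n), k)
    cv_mse = []
    for lam in lambdas:
        sq_err = 0.0
        for test_idx in folds:
            train = np.ones(n, dtype=bool)
            train[test_idx] = False
            beta = ridge_fit(X[train], y[train], lam)
            sq_err += np.sum((y[test_idx] - X[test_idx] @ beta) ** 2)
        cv_mse.append(sq_err / n)
    return lambdas[int(np.argmin(cv_mse))]


def tuning_spread(n, p=10, n_datasets=100, seed=0):
    """Select a ridge penalty in each of `n_datasets` simulated data sets of size n.

    Assumed data-generating model (illustrative only): standard normal
    predictors, all true coefficients equal to 0.2, standard normal errors.
    """
    rng = np.random.default_rng(seed)
    lambdas = np.logspace(-2, 3, 30)  # candidate penalties from 0.01 to 1000
    beta_true = np.full(p, 0.2)
    chosen = np.empty(n_datasets)
    for i in range(n_datasets):
        X = rng.standard_normal((n, p))
        y = X @ beta_true + rng.standard_normal(n)
        chosen[i] = cv_select_lambda(X, y, lambdas, k=5, rng=rng)
    return chosen


if __name__ == "__main__":
    for n in (50, 1000):
        lam = tuning_spread(n)
        lo, med, hi = np.percentile(lam, [2.5, 50, 97.5])
        print(f"n={n}: selected penalty median={med:.2f}, "
              f"2.5%-97.5% range=({lo:.2f}, {hi:.2f})")
```

Comparing the printed ranges for the two sample sizes gives a rough sense of how much the selected penalty varies across development data sets of a given size. A logistic model, the lasso, or elastic net would need a different fitting routine, and the spread observed here should not be read as a reproduction of the paper's results.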
