Article

Stability of clinical prediction models developed using statistical or machine learning methods

Journal

BIOMETRICAL JOURNAL
Volume -, Issue -, Pages -

Publisher

WILEY
DOI: 10.1002/bimj.202200302

Keywords

calibration; fairness; prediction model; stability; uncertainty

Abstract
Clinical prediction models estimate an individual's risk of a particular health outcome. A developed model is a consequence of the development dataset and the model-building strategy, including the sample size, number of predictors, and analysis method (e.g., regression or machine learning). We raise the concern that many models are developed using small datasets that lead to instability in the model and its predictions (estimated risks). We define four levels of model stability in estimated risks, moving from the overall mean to the individual level. Through simulation and case studies of statistical and machine learning approaches, we show that instability in a model's estimated risks is often considerable and ultimately manifests itself as miscalibration of predictions in new data. Therefore, we recommend researchers always examine instability at the model development stage, and we propose instability plots and measures to do so. This entails repeating the model-building steps (those used to develop the original prediction model) in each of multiple (e.g., 1000) bootstrap samples to produce multiple bootstrap models, and then deriving (i) a prediction instability plot of bootstrap model versus original model predictions; (ii) the mean absolute prediction error (the mean absolute difference between individuals' original and bootstrap model predictions); and (iii) calibration, classification, and decision curve instability plots of the bootstrap models applied in the original sample. A case study illustrates how these instability assessments help indicate whether a model's predictions are likely to be reliable, while informing the model's critical appraisal (risk of bias rating), fairness, and further validation requirements.
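The bootstrap procedure described in the abstract is straightforward to implement. Below is a minimal Python sketch, not the authors' code: it assumes a binary outcome, uses scikit-learn's LogisticRegression as a stand-in for whatever model-building strategy was used, and runs on a simulated toy dataset. It computes the mean absolute prediction error (MAPE) and draws the prediction instability plot of bootstrap versus original model predictions.

```python
# Illustrative sketch (not the paper's code) of bootstrap-based
# prediction instability assessment for a clinical prediction model.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2023)

# --- toy development dataset (stand-in for real clinical data) ---
n, p = 200, 5                      # small n, so instability is visible
X = rng.normal(size=(n, p))
logit = X @ np.array([0.8, -0.5, 0.3, 0.0, 0.0]) - 1.0
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# 1) develop the "original" model and obtain its estimated risks
original = LogisticRegression().fit(X, y)
p_orig = original.predict_proba(X)[:, 1]

# 2) repeat the full model-building strategy in B bootstrap samples,
#    applying each bootstrap model to the ORIGINAL individuals
B = 1000
p_boot = np.empty((B, n))
for b in range(B):
    idx = rng.integers(0, n, size=n)        # resample rows with replacement
    model_b = LogisticRegression().fit(X[idx], y[idx])
    p_boot[b] = model_b.predict_proba(X)[:, 1]

# 3) mean absolute prediction error (MAPE): mean absolute difference
#    between each individual's original and bootstrap-model predictions
mape_per_person = np.abs(p_boot - p_orig).mean(axis=0)
print(f"overall MAPE = {mape_per_person.mean():.3f}")

# 4) prediction instability plot: bootstrap vs. original predictions
#    (each individual contributes B points scattered about the 45° line)
plt.scatter(np.tile(p_orig, B), p_boot.ravel(), s=1, alpha=0.05)
plt.plot([0, 1], [0, 1], "k--")
plt.xlabel("estimated risk from original model")
plt.ylabel("estimated risk from bootstrap models")
plt.title("Prediction instability plot")
plt.show()
```

The same loop supports the other assessments mentioned in the abstract: smoothing each row of `p_boot` against the observed outcomes (e.g., with a loess calibration curve) gives a calibration instability plot, and thresholding the predictions gives classification instability. Wider scatter around the 45° line and larger MAPE signal that individual estimated risks depend heavily on the particular development sample drawn.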
