☆ 4.3 Article

External validation: a simulation study to compare cross-validation versus holdout or external testing to assess the performance of clinical prediction models using PET data from DLBCL patients

EJNMMI RESEARCH (2022)

Journal

EJNMMI RESEARCH

Volume 12, Issue 1, Pages -

Publisher

SPRINGER

DOI: 10.1186/s13550-022-00931-w

Keywords

Internal validation; External validation; Model performance; CV-AUC

Category

Radiology, Nuclear Medicine & Medical Imaging

Funding

Dutch Cancer Society [VU 2018-11648]


AI summary

This simulation study compared internal and external validation approaches for clinical prediction models. Cross-validation and holdout yielded comparable performance, but the holdout estimate had higher uncertainty. Bootstrapping gave a slightly lower CV-AUC. Increasing the test set size improved precision and reduced variability. CV-AUC increased with Ann Arbor stage in the stage-specific test datasets. Changing the high-risk cut-off and the false positive and false negative rates affected performance, as shown by a low calibration slope. The EARL2 dataset gave similar performance and precision, but its calibration slope indicated overfitting.

Abstract

Aim: Clinical prediction models need to be validated. In this study, we used simulated data to compare various internal and external validation approaches.

Methods: Data for 500 patients were simulated using the distributions of metabolic tumor volume, standardized uptake value, the maximal distance between the largest lesion and another lesion, WHO performance status and age of 296 diffuse large B cell lymphoma patients. These data were used to predict progression after 2 years based on an existing logistic regression model. Using the simulated data, we applied cross-validation, bootstrapping and holdout (n = 100). We simulated new external datasets (n = 100, n = 200, n = 500). In addition, we (1) simulated stage-specific external datasets, (2) varied the cut-off for high-risk patients, (3) varied the false positive and false negative rates and (4) simulated a dataset with EARL2 characteristics. All internal and external simulations were repeated 100 times. Model performance was expressed as the cross-validated area under the curve (CV-AUC +/- SD) and calibration slope.

Results: Cross-validation (0.71 +/- 0.06) and holdout (0.70 +/- 0.07) resulted in comparable model performance, but the uncertainty was higher with a holdout set. Bootstrapping resulted in a CV-AUC of 0.67 +/- 0.02. The calibration slope was comparable for these internal validation approaches. Increasing the size of the test set resulted in more precise CV-AUC estimates and a smaller SD for the calibration slope. For test datasets with different stages, the CV-AUC increased as the Ann Arbor stage increased. As expected, changing the cut-off for high risk and the false positive and false negative rates influenced the model performance, which is clearly shown by the low calibration slope. The EARL2 dataset resulted in similar model performance and precision, but the calibration slope indicated overfitting.

Conclusion: For small datasets, it is not advisable to use a holdout set or a very small external dataset with similar characteristics, because a single small test dataset suffers from large uncertainty. Repeated CV using the full training dataset is preferred instead. Our simulations also demonstrated that it is important to consider the impact of differences in patient population between training and test data, which may call for adjustment or stratification of relevant variables.
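
The effect of test set size on precision can be illustrated with a minimal sketch. This is not the authors' simulation: it assumes a single standard-normal predictor and a fixed logistic model with made-up coefficients (`intercept=-1.0`, `slope=1.0`), and all function names are hypothetical. It applies that fixed model to repeated simulated external test sets of n = 100, 200 and 500, then reports the mean and SD of the AUC and of the calibration slope. The calibration slope is taken here as the coefficient from a logistic regression of the outcome on the model's linear predictor.

```python
import math
import random
import statistics


def simulate_dataset(n, rng, intercept=-1.0, slope=1.0):
    """Draw n (linear predictor, outcome) pairs from a logistic model.

    The predictor distribution and coefficients are illustrative
    assumptions, not the published DLBCL model.
    """
    data = []
    for _ in range(n):
        lp = intercept + slope * rng.gauss(0.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-lp))
        data.append((lp, 1 if rng.random() < p else 0))
    return data


def auc(data):
    """AUC: probability that an event scores above a non-event (ties count 0.5)."""
    pos = [s for s, y in data if y == 1]
    neg = [s for s, y in data if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def calibration_slope(data, iters=15):
    """Slope b of logit P(y=1) = a + b * lp, fitted by Newton-Raphson."""
    a, b = 0.0, 1.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for lp, y in data:
            p = 1.0 / (1.0 + math.exp(-(a + b * lp)))
            w = p * (1.0 - p)
            g0 += y - p            # gradient w.r.t. intercept
            g1 += (y - p) * lp     # gradient w.r.t. slope
            h00 += w               # entries of the information matrix
            h01 += w * lp
            h11 += w * lp * lp
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return b


def summarise(n, repeats=100, seed=0):
    """Mean and SD of AUC and calibration slope over repeated test sets of size n."""
    rng = random.Random(seed)
    aucs, slopes = [], []
    for _ in range(repeats):
        data = simulate_dataset(n, rng)
        aucs.append(auc(data))
        slopes.append(calibration_slope(data))
    return (statistics.mean(aucs), statistics.stdev(aucs),
            statistics.mean(slopes), statistics.stdev(slopes))


results = {n: summarise(n) for n in (100, 200, 500)}
for n, (m_auc, sd_auc, m_slope, sd_slope) in results.items():
    print(f"n={n}: AUC {m_auc:.2f} +/- {sd_auc:.2f}, "
          f"slope {m_slope:.2f} +/- {sd_slope:.2f}")
```

Because the test data are generated from the same model that is being evaluated, the calibration slope should be centred near 1 in this sketch. Only the spread of the estimates is expected to change with n. Cross-validation and bootstrapping are not covered here, since they require refitting the model on resampled data.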
