Article

An integrated approach to geographic validation helped scrutinize prediction model performance and its variability

Journal

JOURNAL OF CLINICAL EPIDEMIOLOGY
Volume 157, Pages 13-21

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.jclinepi.2023.02.021

Keywords

Calibration; Discrimination; Multicenter; Subgroup discovery; Prediction models; Heterogeneity

This study validated a prediction model for 30-day mortality after transcatheter aortic valve implantation (TAVI) using multicenter data. Predictive performance varied among hospitals: AUCs ranged from 0.59 to 0.79, and the model was miscalibrated in seven of the 16 hospitals. Case-mix differences between hospitals were substantial, and model transportability was low.
Objectives: To illustrate in-depth validation of prediction models developed on multicenter data.

Methods: For each hospital in a multicenter registry, we evaluated the predictive performance of a 30-day mortality prediction model for transcatheter aortic valve implantation (TAVI) using the Netherlands Heart Registration (NHR) dataset. We measured discrimination and calibration per hospital in a leave-center-out analysis (LCOA). Meta-analysis was used to calculate I² values per performance metric from the LCOA and to compute mean and confidence interval (CI) estimates. Case-mix differences between studies were inspected using the framework of Debray et al. for understanding external validation. We also aimed to discover subgroups (SGs) with high model prediction error (PE) and their distribution over the centers.

Results: We studied 16 hospitals with 11,599 TAVI patients and an early mortality of 3.7%. The model's areas under the curve (AUCs) ranged widely between hospitals, from 0.59 to 0.79, and miscalibration occurred in seven hospitals. The mean AUC from meta-analysis was 0.68 (95% CI 0.65-0.70). I² values were 0%, 74%, and 0% for AUC, calibration intercept, and calibration slope, respectively. Between-hospital case-mix differences were substantial, and model transportability was low. One SG was discovered with marked global PE and was associated with poor performance on validation centers.

Conclusion: The illustrated combination of approaches provides useful insights for inspecting multicenter-based prediction models, and it exposes their limitations in transportability and performance variability when applied to different populations.

© 2023 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
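The meta-analysis step in the Methods (a pooled mean with CI and an I² value per performance metric) can be illustrated with a minimal sketch. It assumes inverse-variance weighting with a DerSimonian-Laird random-effects estimate, which is a common choice but is not confirmed by the abstract as the authors' method. The function name `pool_with_i2` and the per-hospital values in the usage example are hypothetical and are not data from the paper.

```python
import math


def pool_with_i2(estimates, std_errors):
    """Pool per-center estimates and quantify heterogeneity.

    Assumption: inverse-variance weights with a DerSimonian-Laird
    random-effects model. Returns (pooled mean, 95% CI, I^2 in percent).
    Requires at least two centers.
    """
    # Fixed-effect weights and mean, needed for Cochran's Q.
    weights = [1.0 / se ** 2 for se in std_errors]
    w_sum = sum(weights)
    fixed_mean = sum(w * y for w, y in zip(weights, estimates)) / w_sum

    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - fixed_mean) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1

    # I^2: share of total variation attributed to between-center differences.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-center variance tau^2.
    c = w_sum - sum(w ** 2 for w in weights) / w_sum
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights, pooled mean, and normal-approximation 95% CI.
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    re_sum = sum(re_weights)
    mean = sum(w * y for w, y in zip(re_weights, estimates)) / re_sum
    se_mean = math.sqrt(1.0 / re_sum)
    ci = (mean - 1.96 * se_mean, mean + 1.96 * se_mean)
    return mean, ci, i2


# Hypothetical per-hospital AUCs and standard errors (illustrative only).
aucs = [0.62, 0.70, 0.66, 0.74]
ses = [0.04, 0.05, 0.03, 0.06]
mean, ci, i2 = pool_with_i2(aucs, ses)
print(f"pooled AUC {mean:.2f}, 95% CI {ci[0]:.2f}-{ci[1]:.2f}, I2 {i2:.0f}%")
```

In practice, bounded metrics such as the AUC are often pooled on a transformed (for example, logit) scale; the sketch pools on the raw scale only to keep the arithmetic easy to follow.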
