4.6 Article

Internal validation of predictive models: Efficiency of some procedures for logistic regression analysis

Journal

JOURNAL OF CLINICAL EPIDEMIOLOGY
Volume 54, Issue 8, Pages 774-781

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/S0895-4356(01)00341-9

Keywords

predictive models; internal validation; logistic regression analysis; bootstrapping

The performance of a predictive model is overestimated when simply determined on the sample of subjects that was used to construct the model. Several internal validation methods are available that aim to provide a more accurate estimate of model performance in new subjects. We evaluated several variants of split-sample, cross-validation and bootstrapping methods with a logistic regression model that included eight predictors for 30-day mortality after an acute myocardial infarction. Random samples with a size between n = 572 and n = 9165 were drawn from a large data set (GUSTO-I; n = 40,830; 2851 deaths) to reflect modeling in data sets with between 5 and 80 events per variable. Independent performance was determined on the remaining subjects. Performance measures included discriminative ability, calibration and overall accuracy. We found that split-sample analyses gave overly pessimistic estimates of performance with large variability. Cross-validation on 10% of the sample had low bias and low variability, but was not suitable for all performance measures. Internal validity could best be estimated with bootstrapping, which provided stable estimates with low bias. We conclude that split-sample validation is inefficient, and recommend bootstrapping for estimation of internal validity of a predictive logistic regression model. © 2001 Elsevier Science Inc. All rights reserved.
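As an illustration of the bootstrapping idea recommended above, the following is a minimal sketch of one common variant, the optimism-corrected bootstrap. The abstract does not say which bootstrap variants were evaluated, so this choice is an assumption. The data are simulated with two predictors (not GUSTO-I), the gradient-ascent fit is only an approximation to a proper maximum-likelihood routine, and the function names `fit_logistic`, `c_statistic` and `bootstrap_validate` are hypothetical. Discriminative ability is measured here by the c-statistic; calibration and overall accuracy are not covered.

```python
import math
import random

def predict(w, x):
    """Predicted probability from weights w = [intercept, coef_1, ...]."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, iters=100, lr=1.0):
    """Approximate logistic regression fit by batch gradient ascent."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(iters):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = yi - predict(w, xi)
            grad[0] += err
            for j in range(p):
                grad[j + 1] += err * xi[j]
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w

def c_statistic(probs, y):
    """Share of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [p for p, yi in zip(probs, y) if yi == 1]
    neg = [p for p, yi in zip(probs, y) if yi == 0]
    score = sum(1.0 if a > b else 0.5 if a == b else 0.0
                for a in pos for b in neg)
    return score / (len(pos) * len(neg))

def performance(w, X, y):
    """c-statistic of the model with weights w on the data (X, y)."""
    return c_statistic([predict(w, x) for x in X], y)

def bootstrap_validate(X, y, n_boot=30, seed=0):
    """Return (apparent, optimism, optimism-corrected) c-statistic."""
    rng = random.Random(seed)
    n = len(y)
    # Apparent performance: model built and evaluated on the same sample.
    apparent = performance(fit_logistic(X, y), X, y)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # draw with replacement
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # resample lacks events or non-events; draw again
        wb = fit_logistic(Xb, yb)
        # Optimism: performance in the bootstrap sample minus performance
        # of the same bootstrap model in the original sample.
        diffs.append(performance(wb, Xb, yb) - performance(wb, X, y))
    optimism = sum(diffs) / len(diffs)
    return apparent, optimism, apparent - optimism

# Simulated example data (assumed coefficients, for illustration only).
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(120)]
y = [1 if data_rng.random() < predict([-0.5, 1.0, 0.5], x) else 0 for x in X]

apparent, optimism, corrected = bootstrap_validate(X, y)
print(f"apparent c = {apparent:.3f}, optimism = {optimism:.3f}, "
      f"corrected c = {corrected:.3f}")
```

Unlike a split-sample analysis, this approach uses all subjects both to build the model and to estimate how much the apparent performance is inflated. In practice a larger number of bootstrap replications than the 30 used here would likely be preferable.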
