Estimating classification error rate: Repeated cross-validation, repeated hold-out and bootstrap

COMPUTATIONAL STATISTICS & DATA ANALYSIS (2009)

Journal

COMPUTATIONAL STATISTICS & DATA ANALYSIS

Volume 53, Issue 11, Pages 3735-3745

Publisher

ELSEVIER SCIENCE BV

DOI: 10.1016/j.csda.2009.04.009

Funding

Soongsil University Research Fund

Abstract

We consider the accuracy estimation of a classifier constructed on a given training sample. The naive resubstitution estimate is known to have a downward bias problem. The traditional approach to tackling this bias problem is cross-validation. The bootstrap is another way to bring down the high variability of cross-validation. But a direct comparison of the two estimators, cross-validation and bootstrap, is not fair because the latter estimator requires much heavier computation. We performed an empirical study to compare the .632+ bootstrap estimator with the repeated 10-fold cross-validation and the repeated one-third holdout estimator. All the estimators were set to require about the same amount of computation. In the simulation study, the repeated 10-fold cross-validation estimator was found to have better performance than the .632+ bootstrap estimator when the classifier is highly adaptive to the training sample. We have also found that the .632+ bootstrap estimator suffers from a bias problem for large samples as well as for small samples. (C) 2009 Elsevier B.V. All rights reserved.
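
To make the three estimators compared in the abstract concrete, the following is a minimal sketch and not the authors' code. It assumes a 1-nearest-neighbour classifier (chosen only because it is highly adaptive to the training sample, so its resubstitution error is zero), synthetic two-class Gaussian data, and illustrative repetition counts (5 x 10-fold, 50 hold-out splits, 50 bootstrap samples) so that each estimator fits about 50 classifiers. All function names are hypothetical. The .632+ formula follows the commonly cited form with the no-information rate gamma and relative overfitting rate R; the exact variant used in the paper may differ.

```python
import random
from collections import Counter

def make_data(n, rng):
    # Assumed toy data: two overlapping 2-D Gaussian classes, balanced labels.
    return [((rng.gauss(1.5 * (i % 2), 1), rng.gauss(1.5 * (i % 2), 1)), i % 2)
            for i in range(n)]

def knn1_predict(train, x):
    # 1-nearest-neighbour: return the label of the closest training point.
    return min(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))[1]

def error_rate(train, test):
    return sum(knn1_predict(train, x) != y for x, y in test) / len(test)

def repeated_cv(data, rng, k=10, repeats=5):
    # Average of `repeats` independent k-fold cross-validation estimates.
    total = 0.0
    for _ in range(repeats):
        idx = list(range(len(data)))
        rng.shuffle(idx)
        wrong = 0
        for f in range(k):
            held = set(idx[f::k])
            train = [data[i] for i in idx if i not in held]
            wrong += sum(knn1_predict(train, data[i][0]) != data[i][1] for i in held)
        total += wrong / len(data)
    return total / repeats

def repeated_holdout(data, rng, repeats=50, test_frac=1 / 3):
    # Average test error over random splits with one third held out.
    n_test = round(len(data) * test_frac)
    total = 0.0
    for _ in range(repeats):
        idx = list(range(len(data)))
        rng.shuffle(idx)
        test = [data[i] for i in idx[:n_test]]
        train = [data[i] for i in idx[n_test:]]
        total += error_rate(train, test)
    return total / repeats

def bootstrap_632_plus(data, rng, n_boot=50):
    n = len(data)
    resub = error_rate(data, data)  # resubstitution (training) error
    wrong, seen = [0] * n, [0] * n
    for _ in range(n_boot):
        draw = [rng.randrange(n) for _ in range(n)]
        in_bag = set(draw)
        train = [data[i] for i in draw]
        for i in range(n):
            if i not in in_bag:  # score only out-of-bag observations
                seen[i] += 1
                wrong[i] += knn1_predict(train, data[i][0]) != data[i][1]
    per_obs = [w / s for w, s in zip(wrong, seen) if s > 0]
    err1 = sum(per_obs) / len(per_obs)  # leave-one-out bootstrap error
    # No-information rate: error expected if labels and predictions were independent.
    p = Counter(y for _, y in data)
    q = Counter(knn1_predict(data, x) for x, _ in data)
    gamma = sum((p[c] / n) * (1 - q[c] / n) for c in p)
    err1c = min(err1, gamma)
    r = (err1c - resub) / (gamma - resub) if gamma > resub and err1c > resub else 0.0
    w = 0.632 / (1 - 0.368 * r)  # weight grows from .632 toward 1 as overfitting grows
    return (1 - w) * resub + w * err1c

rng = random.Random(0)
data = make_data(60, rng)
print("resubstitution:", error_rate(data, data))
print("repeated 10-fold CV:", repeated_cv(data, rng))
print("repeated 1/3 hold-out:", repeated_holdout(data, rng))
print(".632+ bootstrap:", bootstrap_632_plus(data, rng))
```

On a single sample the three estimates will usually be close to each other; the bias and variability differences reported in the abstract only become visible when the procedure is repeated over many simulated training samples and compared against the true error rate.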
