Article

Cross-Validation With Confidence

Journal

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
Volume 115, Issue 532, Pages 1978-1997

Publisher

TAYLOR & FRANCIS INC
DOI: 10.1080/01621459.2019.1672556

Keywords

Cross-validation; Hypothesis testing; Model selection; Overfitting; Tuning parameter selection

Funding

  1. NSF [DMS-1407771, DMS-1553884]


Cross-validation is one of the most popular model and tuning-parameter selection methods in statistics and machine learning. Despite its wide applicability, traditional cross-validation methods tend to overfit because they ignore the uncertainty in the testing sample. We develop a novel, statistically principled inference tool based on cross-validation that takes this uncertainty into account. The method outputs a set of highly competitive candidate models that contains the optimal one with guaranteed probability. As a consequence, our method can achieve consistent variable selection in a classical linear regression setting, for which existing cross-validation methods require unconventional split ratios. When used for tuning-parameter selection, the method can provide a different trade-off between prediction accuracy and model interpretability from existing variants of cross-validation. We demonstrate the performance of the proposed method in several simulated and real data examples. Supplemental materials for this article can be found online.
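The core idea in the abstract can be illustrated with a short sketch: instead of picking the single model with the smallest cross-validated loss, test each candidate against its competitors and keep every model that is not significantly worse than some other model. The sketch below is a simplified illustration under my own assumptions (a normal approximation with a multiplier-bootstrap calibration of the max statistic), not the authors' exact CVC procedure; the function name `cvc_candidate_set` and its interface are hypothetical.

```python
import numpy as np

def cvc_candidate_set(cv_losses, alpha=0.05, n_boot=2000, seed=0):
    """Illustrative sketch of a CVC-style confidence set of models.

    cv_losses: (n_obs, n_models) array of per-observation test losses
    collected across cross-validation folds. For each model j we test
    whether any competitor k has strictly smaller expected loss; models
    for which this test does NOT reject at level alpha are kept.
    Simplified normal-approximation version, not the paper's procedure.
    """
    rng = np.random.default_rng(seed)
    n, m = cv_losses.shape
    keep = []
    for j in range(m):
        # Per-observation loss differences of model j vs each competitor.
        diffs = cv_losses[:, [j]] - np.delete(cv_losses, j, axis=1)
        mean = diffs.mean(axis=0)
        sd = diffs.std(axis=0, ddof=1)
        sd[sd == 0] = 1.0  # guard against identical model columns
        tstats = np.sqrt(n) * mean / sd
        # Reject model j if it is significantly worse than ANY competitor;
        # calibrate the max statistic with a Gaussian multiplier bootstrap.
        centered = (diffs - mean) / sd
        boot_max = np.empty(n_boot)
        for b in range(n_boot):
            w = rng.standard_normal(n)
            boot_max[b] = (w @ centered / np.sqrt(n)).max()
        crit = np.quantile(boot_max, 1 - alpha)
        if tstats.max() <= crit:
            keep.append(j)
    return keep
```

In this sketch the best model stays in the returned set with probability roughly 1 - alpha, while clearly inferior models are screened out; the candidate set may contain more than one model when the data cannot distinguish near-optimal competitors.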
