Article

Nonparametric density estimation by exact leave-p-out cross-validation

Journal

COMPUTATIONAL STATISTICS & DATA ANALYSIS
Volume 52, Issue 5, Pages 2350-2368

Publisher

ELSEVIER SCIENCE BV
DOI: 10.1016/j.csda.2007.10.002

Keywords

cross-validation; delete-p cross-validation; density estimation; histogram; kernel; leave-p-out; multiple testing; quadratic risk; V-fold cross-validation

The problem of density estimation is addressed by minimization of the L2-risk for both histogram and kernel estimators. This quadratic risk is estimated by leave-p-out cross-validation (LPO), which, contrary to common belief, is computationally feasible thanks to closed-form expressions. The potential gain of LPO over V-fold cross-validation (V-fold) in terms of the bias-variance trade-off is highlighted. An exact quantification of the extra variability induced by the preliminary random partition of the data in V-fold is proposed. Furthermore, exact expressions are derived for both the bias and the variance of the risk estimator with histograms. Plug-in estimates of these quantities are provided, and their accuracy is assessed by means of concentration inequalities. An adaptive selection procedure for p in the case of histograms is then presented, which relies on minimization of the mean square error of the LPO risk estimator. Finally, a simulation study is carried out which first illustrates the higher reliability of LPO with respect to V-fold, and then assesses the behavior of the selection procedure. For instance, the optimality of leave-one-out (LOO) is shown, at least empirically, in the context of regular histograms. © 2007 Elsevier B.V. All rights reserved.
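
To make the claim about closed formulas concrete, here is a minimal Python sketch for the histogram case. It is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, and the closed-form coefficients were re-derived here from the hypergeometric moments of the bin counts in a random training subset rather than quoted from the abstract, so they should be checked against the paper before being relied upon. The sketch compares the closed form with a brute-force average over all subsets of size p, which is only tractable for very small samples.

```python
from itertools import combinations

import numpy as np


def _bin_index(x, edges):
    """Bin index of each point; assumes edges[0] <= x <= edges[-1]."""
    idx = np.searchsorted(edges, x, side="right") - 1
    return np.clip(idx, 0, len(edges) - 2)


def lpo_risk_closed_form(x, edges, p):
    """LPO estimate of the L2-risk (up to the constant ||s||^2) of a histogram.

    Assumed closed form (re-derived for this sketch):
      (2n - p) / ((n - 1)(n - p)) * sum_k (n_k / n) / w_k
      - n (n - p + 1) / ((n - 1)(n - p)) * sum_k (n_k / n)^2 / w_k
    where n_k is the count in bin k and w_k its width.
    """
    x = np.asarray(x, dtype=float)
    edges = np.asarray(edges, dtype=float)
    n = x.size
    if not 1 <= p <= n - 1:
        raise ValueError("p must satisfy 1 <= p <= n - 1")
    widths = np.diff(edges)
    counts = np.bincount(_bin_index(x, edges), minlength=widths.size)
    freq = counts / n
    a = (2 * n - p) / ((n - 1) * (n - p))
    b = n * (n - p + 1) / ((n - 1) * (n - p))
    return a * np.sum(freq / widths) - b * np.sum(freq**2 / widths)


def lpo_risk_brute_force(x, edges, p):
    """Same quantity, averaged explicitly over all test sets of size p."""
    x = np.asarray(x, dtype=float)
    edges = np.asarray(edges, dtype=float)
    n = x.size
    widths = np.diff(edges)
    bins = _bin_index(x, edges)
    total, n_splits = 0.0, 0
    for test in combinations(range(n), p):
        test = np.array(test)
        train = np.ones(n, dtype=bool)
        train[test] = False
        # Histogram density fitted on the n - p training points only.
        dens = np.bincount(bins[train], minlength=widths.size) / ((n - p) * widths)
        # Integral of the squared estimate minus twice its mean on the test points.
        total += np.sum(dens**2 * widths) - 2.0 * np.mean(dens[bins[test]])
        n_splits += 1
    return total / n_splits
```

The brute-force version enumerates every one of the C(n, p) splits, whereas the closed form needs only the bin counts, which is what makes LPO usable for model selection at realistic sample sizes. Both functions assume every observation lies inside the range covered by `edges`.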
