Article; Proceedings Paper

Ascertainment of the number of samples in the validation set in Monte Carlo cross validation and the selection of model dimension with Monte Carlo cross validation

Journal

CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS
Volume 82, Issue 1-2, Pages 83-89

Publisher

ELSEVIER SCIENCE BV
DOI: 10.1016/j.chemolab.2005.07.004

Keywords

Monte Carlo cross validation; leave-one-out cross validation; cross validation; partial least squares; near-infrared spectra

Monte Carlo cross validation (MCCV) is applied to two data sets, containing 125 and 1643 near-infrared (NIR) spectra of biological samples, respectively, to determine the number of samples to leave out for validation in MCCV and, consequently, the dimension of the partial least squares (PLS) models. Once a suitable validation-set size has been selected, the appropriate number of latent variables (LV) may be chosen correctly. The results show that the root mean squared error of calibration (RMSEC), the root mean squared error of cross validation (RMSECV), and the number of LVs become sensitive to the validation-set size when too many samples are left out. On this basis, RMSEC and RMSECV are suggested as criteria to help determine the number of samples to leave out for validation in MCCV. The method is easy and convenient to use. For a larger data set, more samples may be left out, but the suitable number of left-out samples decreases if the measurement error level is high. © 2005 Elsevier B.V. All rights reserved.
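The procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the PLS routine is a plain single-response NIPALS-style fit written for illustration, the pooled RMSEC and RMSECV definitions (squared errors summed over all random splits, then averaged) are an assumption, and the function names `pls1_fit`, `pls1_predict`, `mccv`, and `make_demo_data`, along with the synthetic data, are hypothetical.

```python
import math
import random

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pls1_fit(X, y, n_lv):
    """Fit a single-response PLS model (NIPALS-style) with up to n_lv latent variables."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    comps = []
    for _ in range(n_lv):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(_dot(w, w))
        if norm < 1e-12:  # no covariance left to model
            break
        w = [v / norm for v in w]
        t = [_dot(row, w) for row in E]  # scores
        tt = _dot(t, t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = _dot(f, t) / tt
        # Deflate X and y before extracting the next latent variable.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, load, q))
    return x_mean, y_mean, comps

def pls1_predict(model, x):
    """Predict one sample by repeating the deflation steps on the new spectrum."""
    x_mean, y_mean, comps = model
    e = [a - m for a, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in comps:
        t = _dot(e, w)
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, load)]
    return y_hat

def mccv(X, y, n_left_out, max_lv, n_splits=50, seed=0):
    """Return (rmsec, rmsecv), each a list indexed by number of LVs minus one."""
    rng = random.Random(seed)
    n = len(y)
    n_cal = n - n_left_out
    sse_cal = [0.0] * max_lv
    sse_val = [0.0] * max_lv
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)  # random split, drawn afresh for every repetition
        val, cal = idx[:n_left_out], idx[n_left_out:]
        Xc, yc = [X[i] for i in cal], [y[i] for i in cal]
        for k in range(1, max_lv + 1):
            model = pls1_fit(Xc, yc, k)
            sse_cal[k - 1] += sum((pls1_predict(model, X[i]) - y[i]) ** 2 for i in cal)
            sse_val[k - 1] += sum((pls1_predict(model, X[i]) - y[i]) ** 2 for i in val)
    rmsec = [math.sqrt(s / (n_splits * n_cal)) for s in sse_cal]
    rmsecv = [math.sqrt(s / (n_splits * n_left_out)) for s in sse_val]
    return rmsec, rmsecv

def make_demo_data(n=40, p=6, noise=0.05, seed=1):
    """Synthetic stand-in for spectra: two latent factors plus noise (not real NIR data)."""
    rng = random.Random(seed)
    load = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(2)]
    X, y = [], []
    for _ in range(n):
        t1, t2 = rng.gauss(0, 2), rng.gauss(0, 1)
        X.append([t1 * load[0][j] + t2 * load[1][j] + rng.gauss(0, noise) for j in range(p)])
        y.append(t1 - 2 * t2 + rng.gauss(0, noise))
    return X, y

if __name__ == "__main__":
    X, y = make_demo_data()
    # Scan the validation-set size and watch how RMSEC, RMSECV and the best LV count move.
    for n_left_out in (4, 10, 20, 30):
        rmsec, rmsecv = mccv(X, y, n_left_out, max_lv=3, n_splits=20)
        best_lv = 1 + rmsecv.index(min(rmsecv))
        print(n_left_out, best_lv, [round(v, 3) for v in rmsec], [round(v, 3) for v in rmsecv])
```

Scanning `n_left_out` in this way mirrors the idea in the abstract: a validation-set size is a reasonable candidate as long as RMSEC, RMSECV, and the selected number of LVs stay stable, and it is too large once they begin to change noticeably.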

