The importance of choosing a proper validation strategy in predictive models. A tutorial with real examples

ANALYTICA CHIMICA ACTA (2023)

Journal

ANALYTICA CHIMICA ACTA

Volume 1275, Issue -, Pages -

Publisher

ELSEVIER

DOI: 10.1016/j.aca.2023.341532

Keywords

Validation; Cross-validation; PLS-DA; Resampling; Permutation test; Jackknife; Bootstrap

Automated Summary

Machine learning is the art of using measurement data and predictive variables to forecast future events. However, the crucial stage of validation is often overlooked. This manuscript highlights the importance of data structure and demonstrates how easily models can be misleading without proper validation strategies.

Abstract

Machine learning is the art of combining a set of measurement data and predictive variables to forecast future events. Every day, new and increasingly sophisticated modelling approaches appear in the literature. However, less importance is given to the crucial stage of validation. Validation is the assessment that the model reliably links the measurements and the predictive variables. There are many ways to validate and cross-validate a model, yet a model that appears reliably validated may still misrepresent the real nature of the data and fail to predict external samples. This manuscript shows in a didactic manner how important the data structure is when a model is constructed, and how easy it is to obtain models that look promising when the cross-validation and external validation strategies are poorly designed. A comprehensive overview of the main validation strategies is given, exemplified by three different scenarios, all of them focused on classification.
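
The abstract's point about data structure can be illustrated with one common pitfall: replicate measurements of the same physical sample ending up in both the training and the test fold. The sketch below is not taken from the paper. The function names (`naive_kfold`, `grouped_kfold`, `leaked_groups`) and the data layout (6 samples measured 3 times each) are hypothetical, chosen only to show how a split that ignores grouping differs from one that keeps each group whole.

```python
import random
from collections import defaultdict


def naive_kfold(n_samples, k, seed=0):
    """Shuffle row indices and deal them into k folds, ignoring any grouping."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def grouped_kfold(groups, k, seed=0):
    """Deal whole groups into k folds so replicates never straddle two folds."""
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    unique = sorted(by_group)
    random.Random(seed).shuffle(unique)
    folds = [[] for _ in range(k)]
    for j, g in enumerate(unique):
        folds[j % k].extend(by_group[g])
    return folds


def leaked_groups(folds, groups):
    """Return the groups whose rows appear in more than one fold."""
    seen = defaultdict(set)
    for f, fold in enumerate(folds):
        for i in fold:
            seen[groups[i]].add(f)
    return {g for g, fs in seen.items() if len(fs) > 1}


# Hypothetical layout: 6 physical samples, each measured 3 times (18 rows).
groups = [s for s in range(6) for _ in range(3)]
naive = naive_kfold(len(groups), k=3)
grouped = grouped_kfold(groups, k=3)

print("groups split across folds (naive):  ", sorted(leaked_groups(naive, groups)))
print("groups split across folds (grouped):", sorted(leaked_groups(grouped, groups)))
```

Any group reported for the naive split has replicates on both sides of at least one train/test partition, which tends to make cross-validated performance look better than performance on truly external samples. The grouped split reports none by construction.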
