4.5 Article

The prediction error in CLS and PLS: the importance of feature selection prior to multivariate calibration

Journal

JOURNAL OF CHEMOMETRICS
Volume 19, Issue 2, Pages 107-118

Publisher

WILEY
DOI: 10.1002/cem.915

Keywords

classical least squares; partial least squares; prediction error; dimensional reduction; feature selection

Abstract

Classical least squares (CLS) and partial least squares (PLS) are two common multivariate regression algorithms in chemometrics. This paper presents an asymptotically exact mathematical analysis of the mean squared error of prediction of CLS and PLS under the linear mixture model commonly assumed in spectroscopy. For CLS regression with a very large calibration set, the root mean squared error is approximately equal to the noise per wavelength divided by the length of the net analyte signal vector. It is shown, however, that for a finite training set with n samples in p dimensions there are additional error terms that depend on σ²p²/n², where σ is the noise level per co-ordinate. Therefore, in the 'large p, small n' regime common in spectroscopy, these terms can be quite large and even dominate the overall prediction error. It is demonstrated both theoretically and by simulations that dimensional reduction of the input data via their compact representation with a few features, selected for example by adaptive wavelet compression, can substantially decrease these effects and recover the asymptotic error. This analysis provides a theoretical justification for the need to perform feature selection (dimensional reduction) of the input data prior to application of multivariate regression algorithms. Copyright © 2005 John Wiley & Sons, Ltd.
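
The abstract's central claim, that reducing p before regression suppresses the finite-sample error terms scaling with σ²p²/n², can be illustrated with a small simulation. The sketch below is not the authors' code: it uses a plain CLS estimator and simple contiguous wavelength binning as a stand-in for the adaptive wavelet compression mentioned in the abstract, and all model details (Gaussian band shapes, the values of p, n, σ, and the number of retained features m) are arbitrary choices made for illustration only.

```python
# Illustrative sketch (not the authors' code): compare CLS prediction error on
# raw high-dimensional spectra vs. on a small set of binned features, under a
# linear mixture model with additive white noise. All parameters are arbitrary.
import numpy as np

rng = np.random.default_rng(0)

p, n, n_test, k = 1000, 20, 2000, 3   # wavelengths, calibration size, test size, components
sigma = 0.1                            # noise level per coordinate (sigma in the abstract)
m = 20                                 # number of retained features after binning

# Pure-component spectra: three Gaussian bands on a unit wavelength grid.
grid = np.linspace(0.0, 1.0, p)
S = np.stack([np.exp(-0.5 * ((grid - mu) / 0.08) ** 2) for mu in (0.3, 0.5, 0.7)])  # k x p

def simulate(n_samples):
    """Draw concentrations and noisy mixture spectra X = C S + noise."""
    C = rng.uniform(0.2, 1.0, size=(n_samples, k))
    X = C @ S + sigma * rng.standard_normal((n_samples, p))
    return C, X

def cls_rmsep(X_cal, C_cal, X_test, C_test):
    """CLS: estimate the pure spectra from the calibration set, predict test
    concentrations by least squares, and return the RMSEP for component 1."""
    K_hat, *_ = np.linalg.lstsq(C_cal, X_cal, rcond=None)   # k x p estimate of S
    C_pred = X_test @ np.linalg.pinv(K_hat)                 # n_test x k predictions
    return np.sqrt(np.mean((C_pred[:, 0] - C_test[:, 0]) ** 2))

def bin_features(X):
    """Crude dimensional reduction: average p // m contiguous wavelengths per
    bin (a stand-in for the adaptive wavelet compression cited in the paper)."""
    return X.reshape(X.shape[0], m, p // m).mean(axis=2)

C_cal, X_cal = simulate(n)
C_test, X_test = simulate(n_test)

err_full = cls_rmsep(X_cal, C_cal, X_test, C_test)                                # large p, small n
err_binned = cls_rmsep(bin_features(X_cal), C_cal, bin_features(X_test), C_test)  # m features only

print(f"RMSEP, CLS on all {p} wavelengths : {err_full:.4f}")
print(f"RMSEP, CLS on {m} binned features : {err_binned:.4f}")
```

With these (arbitrary) settings, the run on m binned features typically yields a noticeably smaller RMSEP than the run on all p wavelengths, mirroring the abstract's point that a compact feature representation suppresses the finite-sample error terms and approaches the asymptotic error.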

