4.7 Article

An improved successive projections algorithm version to variable selection in multiple linear regression

Journal

ANALYTICA CHIMICA ACTA
Volume 1274, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.aca.2023.341560

Keywords

Variable selection; Successive projections algorithm; Multilinear regression; Partial least squares; NIR spectrometry

This study proposes a new algorithm called fSPA-MLR, which enhances the performance of the original SPA-MLR method by adding a filter step to reduce the number of uninformative variables. The fSPA-MLR models demonstrate superior performance compared to PLS and the original SPA-MLR models in both cross-validation and external prediction.
The successive projections algorithm (SPA) aims to improve the accuracy of multiple linear regression (MLR) by minimizing the impact of collinearity in the calibration data set. Using SPA as a variable selection step for MLR yields the SPA-MLR method, which has been reported in the literature to produce models that, in some cases, predict well compared with conventional full-spectrum models obtained by partial least squares (PLS). This paper proposes adding a filter step to the current version of SPA to reduce the number of uninformative variables before the projection phase and to help the algorithm select the best variables in subsequent steps. The proposed fSPA-MLR algorithm is evaluated in two case studies involving the near-infrared spectrometric analysis of pharmaceutical tablet and diesel/biodiesel mixture samples. Compared to PLS, the fSPA-MLR models demonstrate similar or better performance. Moreover, the fSPA-MLR models outperform the original SPA-MLR in both cross-validation and external prediction. The fSPA-MLR models deliver superior results regardless of the pre-processing tested, including first-derivative Savitzky-Golay (SG) and Standard Normal Variate (SNV), and even on raw spectral data.
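To make the filter-then-project idea concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not state which filter criterion fSPA-MLR uses, so the absolute correlation with the response used here is an assumption chosen purely for illustration. The function names (`filter_variables`, `spa_chain`, `fit_mlr`), the `keep_fraction` parameter, and the synthetic data are all hypothetical. The sketch also builds a single projection chain from one starting variable, whereas SPA as usually described evaluates chains from every starting variable and picks the best one by validation error.

```python
import numpy as np


def filter_variables(X, y, keep_fraction=0.5):
    """Hypothetical filter step: keep the columns most correlated with y.

    ASSUMPTION: the abstract does not specify the filter criterion;
    absolute Pearson correlation is used here only as an illustration.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    # Constant columns get a correlation of zero instead of dividing by zero.
    corr = np.abs(Xc.T @ yc) / np.where(denom > 0, denom, np.inf)
    n_keep = max(1, int(round(keep_fraction * X.shape[1])))
    return np.sort(np.argsort(corr)[::-1][:n_keep])


def spa_chain(X, start, n_select):
    """Projection phase: build one chain of minimally collinear columns."""
    Xp = X - X.mean(axis=0)  # centered working copy
    selected = [start]
    for _ in range(n_select - 1):
        v = Xp[:, selected[-1]]
        vv = v @ v
        if vv == 0:
            break
        # Project every column onto the subspace orthogonal to v.
        Xp = Xp - np.outer(v, (v @ Xp) / vv)
        norms = np.linalg.norm(Xp, axis=0)
        norms[selected] = -1.0  # never reselect a chosen column
        selected.append(int(np.argmax(norms)))
    return selected


def fit_mlr(X, y, cols):
    """Ordinary least-squares MLR on the selected columns (with intercept)."""
    A = np.column_stack([np.ones(X.shape[0]), X[:, cols]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


# Synthetic stand-in for spectra: 40 samples, 30 "wavelengths".
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 30))
y = 2.0 * X[:, 3] - 1.0 * X[:, 17] + 0.01 * rng.normal(size=40)

kept = filter_variables(X, y, keep_fraction=0.5)       # filter step
chain = spa_chain(X[:, kept], start=0, n_select=5)     # projection phase
cols = kept[chain]                                     # indices in the full X
coef = fit_mlr(X, y, cols)                             # intercept + 5 slopes
```

Running the filter first shrinks the candidate set that the projection phase searches, which is the mechanism the abstract describes. In a real workflow the chain length and starting variable would be chosen by cross-validation rather than fixed as above.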
