Article

Fused stagewise regression - A waveband selection algorithm for spectroscopy

Journal

CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS
Volume 149, Pages 53-65

Publisher

ELSEVIER
DOI: 10.1016/j.chemolab.2015.09.004

Keywords

Robust variable selection; Wavelength selection; Fused Lasso; Double repeated cross validation; Spectroscopy

Funding

  1. COMET programme within the research network Process Analytical Chemistry (PAC) of the Austrian research funding association (FFG) [825340]

Abstract

Partial least squares (PLS) and principal component regression (PCR), the most popular regression techniques in chemometrics, can in theory handle large numbers of possibly correlated variables, as occur in the analysis of spectroscopic data. Nevertheless, the importance of performing some form of variable selection in practical applications has been widely discussed and acknowledged. In this work we address this problem by proposing a sparse regression algorithm, referred to as fused stagewise regression (FSR), which iteratively selects connected regions of variables (wavelengths). The algorithm is easy to implement and interpret because it resembles typical steps in iterative manual feature selection procedures. We evaluate the proposed variable selection technique on a publicly available benchmark data set and compare the performance of PLS models built on the resulting selection with that of models obtained by state-of-the-art feature selection methods from chemometrics and machine learning. To ensure robust feature selection, we integrate the individual selection methods into an extensive repeated cross validation procedure. For the data set under investigation, FSR performs at least as well as state-of-the-art approaches and well within the range of variable selections provided by experts. (C) 2015 Elsevier B.V. All rights reserved.
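
The idea of iteratively selecting connected wavelength regions can be illustrated with a minimal sketch. This is not the authors' FSR algorithm, whose details are not given in the abstract. It is a generic forward stagewise regression in which each step nudges the coefficients of one contiguous window of variables. The function name `windowed_stagewise` and the parameters `width`, `step`, `n_iter`, and `tol` are assumptions made purely for illustration.

```python
import numpy as np


def windowed_stagewise(X, y, width=5, step=0.01, n_iter=500, tol=1e-8):
    """Forward stagewise regression that updates one contiguous window per step.

    Illustrative sketch only, not the published FSR method. All parameter
    names and defaults are hypothetical.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Center predictors and response so that no intercept is needed.
    Xc = X - X.mean(axis=0)
    residual = y - y.mean()

    coef = np.zeros(Xc.shape[1])
    kernel = np.ones(width)

    for _ in range(n_iter):
        # Inner product of every variable (wavelength) with the residual.
        corr = Xc.T @ residual
        # Sum those inner products over every contiguous window of `width`
        # variables; entry i covers variables i .. i + width - 1.
        window_score = np.convolve(corr, kernel, mode="valid")
        start = int(np.argmax(np.abs(window_score)))
        score = window_score[start]
        if abs(score) < tol:
            break  # nothing left to explain
        sign = np.sign(score)

        # Move all coefficients in the chosen window by a small step
        # and update the residual accordingly.
        window = slice(start, start + width)
        coef[window] += step * sign
        residual -= step * sign * Xc[:, window].sum(axis=1)

    selected = np.flatnonzero(coef)  # indices of variables that were ever updated
    return coef, selected
```

Because every update touches a whole window, the nonzero coefficients form connected bands rather than isolated wavelengths. In practice the selected indices would then be passed to a PLS model and assessed by repeated cross validation, as the abstract describes.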