4.4 Article

Variable Selection in Visible and Near-Infrared Spectral Analysis for Noninvasive Determination of Soluble Solids Content of 'Ya' Pear

Journal

FOOD ANALYTICAL METHODS
Volume 7, Issue 9, Pages 1891-1902

Publisher

SPRINGER
DOI: 10.1007/s12161-014-9832-8

Keywords

Near infrared spectroscopy; Monte Carlo-uninformative variable elimination; Successive projections algorithm; Variable selection; Soluble solids content; 'Ya' pear

Funding

  1. National Natural Science Foundation of China [31301236]
  2. China Postdoctoral Science Foundation [2012M520193]
  3. Postdoctoral Science Foundation of Beijing of China [2013ZZ-70]

Informative variable selection, or wavelength selection, plays an important role in the quantitative analysis of near-infrared (NIR) spectra because modern spectroscopic instruments usually have high resolution, and the resulting spectral data sets may contain thousands of variables and hundreds or thousands of samples. In this study, a new combination of Monte Carlo-uninformative variable elimination (MC-UVE) and the successive projections algorithm (SPA), denoted MC-UVE-SPA, was proposed to select the most effective variables. MC-UVE was first used to eliminate the uninformative variables in the raw spectral data. SPA was then applied to select the variables with the least collinearity. A case study was carried out using NIR spectroscopy for the non-destructive determination of soluble solids content (SSC) in 'Ya' pear. A total of 160 samples were prepared for the calibration (n = 120) and prediction (n = 40) sets. Three calibration algorithms were used to build models: two linear methods, partial least squares regression (PLS) and multiple linear regression (MLR), and one nonlinear method, least-squares support vector machine (LS-SVM). The models were built on the variables selected by SPA, UVE, MC-UVE, UVE-SPA, and MC-UVE-SPA, respectively. The results indicated that the linear models (PLS and MLR) were more effective than the nonlinear LS-SVM model in predicting the SSC of 'Ya' pear. Among the linear models, the different variable selection methods gave similar results, with root mean square error of prediction (RMSEP) values ranging from 0.2437 to 0.2830. However, combining MC-UVE with SPA helped to obtain a more parsimonious and efficient model for predicting SSC values in 'Ya' pear. The MC-UVE-SPA-MLR model, built on 22 effective variables selected by MC-UVE-SPA, offered the best balance between model accuracy and model complexity among all the models developed. Its coefficient of determination (r²), RMSEP, and residual predictive deviation (RPD) were 0.9271, 0.2522, and 3.7037, respectively.
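The three figures of merit reported above (coefficient of determination, root mean square error of prediction, and residual predictive deviation) can be illustrated with a minimal Python sketch. The definitions used here are common conventions and are assumptions, not necessarily the exact formulas used in the paper: r² is taken as 1 - SSE/SST, and the residual predictive deviation as the population standard deviation of the reference values divided by the root mean square error of prediction. The function names and the sample SSC values are hypothetical.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a prediction set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SSE/SST."""
    mean = sum(y_true) / len(y_true)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - sse / sst


def rpd(y_true, y_pred):
    """Residual predictive deviation: SD of reference values / RMSEP.

    Assumption: the population SD (divisor n) is used; some authors
    use the sample SD (divisor n - 1) instead.
    """
    n = len(y_true)
    mean = sum(y_true) / n
    sd = math.sqrt(sum((t - mean) ** 2 for t in y_true) / n)
    return sd / rmsep(y_true, y_pred)


# Hypothetical reference and predicted SSC values (not data from the paper).
reference = [10.0, 11.0, 12.0, 13.0]
predicted = [10.2, 10.9, 12.1, 12.8]

print(rmsep(reference, predicted))
print(r_squared(reference, predicted))
print(rpd(reference, predicted))

# Under the assumed definitions, RPD = 1 / sqrt(1 - r^2).
# The reported r^2 of 0.9271 is consistent with the reported RPD of 3.7037.
print(1.0 / math.sqrt(1.0 - 0.9271))
```

Under these assumptions the three metrics are not independent: once r² is known on the prediction set, the residual predictive deviation follows from it, which is why the reported values 0.9271 and 3.7037 agree with each other.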
