4.7 Article

Towards Optimal Variable Selection Methods for Soil Property Prediction Using a Regional Soil Vis-NIR Spectral Library

Journal

REMOTE SENSING
Volume 15, Issue 2

Publisher

MDPI
DOI: 10.3390/rs15020465

Keywords

proximal soil sensing; partial least squares regression; Cubist; random forests; forward recursive feature selection

Soil visible and near-infrared (Vis-NIR, 350-2500 nm) spectroscopy has proven to be an alternative to conventional laboratory analysis. The study evaluated seven variable selection algorithms and three predictive algorithms for predicting soil properties from a regional soil Vis-NIR spectral library. The results showed that Cubist outperformed partial least squares regression (PLSR) and random forests (RF) for most soil properties when using the full spectra. The study provides a reference for predicting soil information using spectroscopic techniques together with variable selection algorithms.
Soil visible and near-infrared (Vis-NIR, 350-2500 nm) spectroscopy has proven to be an alternative to conventional laboratory analysis because it is rapid, cost-effective, non-destructive and environmentally friendly. Various variable selection methods have been used to address the high redundancy, heavy computation and model complexity that come with using full spectra in spectral modelling. However, most previous studies used a linear algorithm for variable selection, and the application of non-linear algorithms remains poorly explored. To address this knowledge gap, we used a regional soil Vis-NIR spectral library (1430 soil samples) to evaluate seven variable selection algorithms together with three predictive algorithms in predicting seven soil properties. Our results showed that Cubist outperformed partial least squares regression (PLSR) and random forests (RF) for most soil properties (R² > 0.75 for soil organic matter, total nitrogen and pH) when using the full spectra. Most variable selection algorithms greatly reduced the number of spectral bands and therefore simplified the predictive models without losing accuracy. The results also showed that no single variable selection algorithm was optimal across the predictive algorithms: (1) competitive adaptive reweighted sampling (CARS) always performed best for the PLSR algorithm, followed by forward recursive feature selection (FRFS); (2) recursive feature elimination (RFE) and genetic algorithm (GA) generally had better accuracy than the others for the Cubist algorithm; and (3) FRFS had the best model performance for the RF algorithm. In addition, performance was generally better when the algorithm used in variable selection matched the predictive algorithm. The outcome of this study provides a valuable reference for predicting soil information using spectroscopic techniques together with variable selection algorithms.
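
To make the idea of forward feature selection with a matched predictor concrete, the following is a minimal sketch and not the study's implementation. It assumes a plain least-squares model as a stand-in for PLSR, Cubist or RF, synthetic "spectra" of 12 independent bands in place of real Vis-NIR data, and a hypothetical stopping threshold `min_gain`; all function names are illustrative. Bands are added greedily, one at a time, as long as the validation R² of the same model used for prediction keeps improving.

```python
import random

def fit_linear(X, y):
    """Least squares with intercept, solved by Gauss-Jordan on the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(p)]
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def predict(coef, X):
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in X]

def r2(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def take(X, cols):
    """Keep only the given band indices from each spectrum."""
    return [[r[c] for c in cols] for r in X]

def forward_select(X_tr, y_tr, X_va, y_va, min_gain=0.01):
    """Greedy forward selection scored by validation R² of the predictive model."""
    n_bands = len(X_tr[0])
    selected, best = [], float("-inf")
    while len(selected) < n_bands:
        scores = {}
        for b in range(n_bands):
            if b in selected:
                continue
            cols = selected + [b]
            coef = fit_linear(take(X_tr, cols), y_tr)
            scores[b] = r2(y_va, predict(coef, take(X_va, cols)))
        band = max(scores, key=scores.get)
        if scores[band] - best < min_gain:  # stop when the gain is negligible
            break
        selected.append(band)
        best = scores[band]
    return selected, best

# Synthetic data (an assumption): only bands 2 and 7 carry signal.
rng = random.Random(0)
n, n_bands = 200, 12
X = [[rng.gauss(0, 1) for _ in range(n_bands)] for _ in range(n)]
y = [3 * r[2] - 2 * r[7] + rng.gauss(0, 0.05) for r in X]

selected, score = forward_select(X[:140], y[:140], X[140:], y[140:])
print("selected bands:", selected, "validation R2:", round(score, 3))
```

Real soil spectra are strongly collinear across neighbouring wavelengths, so the selected bands in practice are far less clear-cut than in this toy setting, and a cross-validated score would usually replace the single hold-out split used here.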
