Article

Improving the prediction performance of a large tropical vis-NIR spectroscopic soil library from Brazil by clustering into smaller subsets or use of data mining calibration techniques

Journal

EUROPEAN JOURNAL OF SOIL SCIENCE
Volume 65, Issue 5, Pages 718-729

Publisher

WILEY
DOI: 10.1111/ejss.12165

Funding

  1. FAPESP (São Paulo Research Foundation)
  2. CNPq (National Council for Scientific and Technological Development)
  3. Swedish Farmers' Foundation for Agricultural Research
  4. Swedish Research Council Formas

Abstract

Effective agricultural planning requires basic soil information. In recent decades, visible near-infrared diffuse reflectance spectroscopy (vis-NIR) has been shown to be a viable alternative for rapidly analysing soil properties. We studied 7172 samples of seven different soil types collected from several regions of Brazil and varying in organic matter (OM) (0.2-10.3%) and clay content (0.2-99.0%). The aim was to explore whether the performance of vis-NIR data in predicting OM and clay content in this library could be enhanced by dividing it into smaller sub-libraries on the basis of their vis-NIR spectra. We used partial least squares regression (PLSR) models on the sub-libraries and compared the results with PLSR and two non-linear calibration techniques, boosted regression trees (BT) and support vector machines (SVM), applied to the whole library. The whole-library calibrations for clay performed well (modelling efficiency (ME) > 0.82; root mean squared error (RMSE) < 10.9%), reflecting the influence of the direct spectral responses of this property in the vis-NIR range. Calibrations for OM were reasonably good, especially in view of the very small variation in this property (ME > 0.60; RMSE < 0.55%). The best results were, however, found when the large library was divided into smaller subsets by using variation in the mean-normalized or first-derivative spectra. This divided the global data set into clusters that were more uniform in mineralogy, regardless of geographical origin, and improved predictive performance. The best clustering method improved the RMSE in the validation to 8.6% clay and 0.47% OM, which corresponds to a 21% and 15% reduction, respectively, as compared with whole-library PLSR. For the whole library, SVM performed almost equally well, reducing RMSE to 8.9% clay and 0.48% OM.
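The two validation statistics quoted in the abstract, and the reported percentage reductions, can be illustrated with a minimal sketch. The abstract does not give the formulas, so the code assumes the usual definitions: RMSE as the square root of the mean squared prediction error, and ME as one minus the ratio of the residual sum of squares to the total sum of squares about the observed mean. The function names are hypothetical, and the baselines of 10.9% clay and 0.55% OM are taken from the upper bounds quoted for the whole-library calibrations, which is an assumption about which values the reductions refer to.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def modelling_efficiency(observed, predicted):
    """ME, assumed here to be 1 - SS_residual / SS_total (not defined in the abstract)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def relative_reduction(baseline, improved):
    """Percentage reduction of `improved` relative to `baseline`."""
    return 100.0 * (baseline - improved) / baseline


if __name__ == "__main__":
    # Assumed baselines: the whole-library RMSE bounds quoted in the abstract.
    print(round(relative_reduction(10.9, 8.6)))   # clay: 21
    print(round(relative_reduction(0.55, 0.47)))  # OM: 15
```

Under these assumptions the arithmetic reproduces the 21% and 15% reductions stated for clay and OM. With this definition, an ME of 1 indicates perfect prediction and an ME of 0 indicates a model no better than predicting the observed mean.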
