Article

Evaluation of digital soil mapping approaches with large sets of environmental covariates

Journal

SOIL
Volume 4, Issue 1, Pages 1-22

Publisher

COPERNICUS GESELLSCHAFT MBH
DOI: 10.5194/soil-4-1-2018

Funding

  1. Swiss National Science Foundation SNSF [NRP 68]
  2. Swiss Earth Observatory Network (SEON)
  3. University of Zurich Research Priority Program on Global Change and Biodiversity (URPP GCB)

The spatial assessment of soil functions requires maps of basic soil properties. Unfortunately, these are either missing for many regions or are not available at the desired spatial resolution or down to the required soil depth. The field-based generation of large soil datasets and conventional soil maps remains costly. Meanwhile, legacy soil data and comprehensive sets of spatial environmental data are available for many regions. Digital soil mapping (DSM) approaches relating soil data (responses) to environmental data (covariates) face the challenge of building statistical models from large sets of covariates originating, for example, from airborne imaging spectroscopy or multi-scale terrain analysis. We evaluated six approaches for DSM in three study regions in Switzerland (Berne, Greifensee, ZH forest) by mapping the effective soil depth available to plants (SD), pH, soil organic matter (SOM), effective cation exchange capacity (ECEC), clay, silt, gravel content and fine fraction bulk density for four soil depths (totalling 48 responses). Models were built from 300-500 environmental covariates by selecting linear models through (1) grouped lasso and (2) an ad hoc stepwise procedure for robust external-drift kriging (georob). For (3) geoadditive models we selected penalized smoothing spline terms by component-wise gradient boosting (geoGAM). We further used two tree-based methods: (4) boosted regression trees (BRTs) and (5) random forest (RF). Lastly, we computed (6) weighted model averages (MAs) from the predictions obtained from methods 1-5. Lasso, georob and geoGAM successfully selected strongly reduced sets of covariates (subsets of 3-6 % of all covariates). Differences in predictive performance, tested on independent validation data, were mostly small and did not reveal a single best method for 48 responses. Nevertheless, RF was often the best among methods 1-5 (28 of 48 responses), but was outcompeted by MA for 14 of these 28 responses. 
RF tended to over-fit the data. The performance of BRT was slightly worse than RF. GeoGAM performed poorly on some responses and was the best only for 7 of 48 responses. The prediction accuracy of lasso was intermediate. All models generally had small bias. Only the computationally very efficient lasso had slightly larger bias because it tended to under-fit the data. Summarizing, although differences were small, the frequencies of the best and worst performance clearly favoured RF if a single method is applied and MA if multiple prediction models can be developed.
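
The weighted model averaging step (method 6) can be illustrated with a minimal sketch. The abstract does not state how the weights were derived, so the inverse-MSE weighting below is an assumption chosen for illustration, and the function names `inverse_mse_weights` and `weighted_model_average` are hypothetical. The numbers in the example are made up and are not results from the study.

```python
import numpy as np


def inverse_mse_weights(cv_mse):
    """Turn per-model error estimates into weights that sum to 1.

    Assumption: weights are proportional to 1 / MSE, so models with a
    smaller (e.g. cross-validated) error get a larger weight.
    """
    cv_mse = np.asarray(cv_mse, dtype=float)
    if np.any(cv_mse <= 0):
        raise ValueError("MSE values must be strictly positive")
    inverse = 1.0 / cv_mse
    return inverse / inverse.sum()


def weighted_model_average(predictions, weights):
    """Combine predictions of several models into one averaged prediction.

    predictions: array of shape (n_models, n_locations)
    weights:     array of shape (n_models,), expected to sum to 1
    """
    predictions = np.asarray(predictions, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if predictions.shape[0] != weights.shape[0]:
        raise ValueError("need exactly one weight per model")
    # Weighted sum over the model axis gives one value per location.
    return weights @ predictions


# Illustrative values only: two models, two prediction locations.
example_predictions = [[2.0, 4.0],   # model A
                       [6.0, 8.0]]   # model B
example_weights = inverse_mse_weights([1.0, 3.0])
example_average = weighted_model_average(example_predictions, example_weights)
```

In this sketch the model with the smaller error dominates the average, but a poorly performing model still contributes a little. Other weighting schemes are equally plausible.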
