Article

Improving the spatial prediction of soil organic carbon using environmental covariates selection: A comparison of a group of environmental covariates

Journal

CATENA
Volume 208

Publisher

ELSEVIER
DOI: 10.1016/j.catena.2021.105723

Keywords

Soil organic carbon; Spatial prediction; Environmental covariates; Time-series vegetation indices; Machine learning; Uncertainty

Funding

  1. National Key Research and Development Program of China [2017YFA0604302, 2018YFA0606500]
  2. Isfahan University of Technology
  3. Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy - EXC [2064/1, 390727645, SFB 1070, 215859406]

In this study, the performance of predicting soil organic carbon (SOC) in an arid agroecosystem in Iran using different datasets and machine learning algorithms was compared. The results showed that the Cubist model performed the best with the MCC dataset and the combined dataset of MCC and remote sensing time-series (RST), while the RF model showed better results for the RST dataset. Soil properties were found to be the main factors influencing SOC variation in the MCC and combined datasets, while NDVI was the most controlling factor in the RST dataset. The study suggested that time-series vegetation indices may not significantly improve SOC prediction accuracy, but combining MCC and RST datasets could produce SOC spatial maps with lower uncertainty.
In the digital soil mapping (DSM) framework, machine learning models quantify the relationship between soil observations and environmental covariates. DSM studies generally rely on the most commonly used covariates (MCC; e.g., topographic attributes, single-date remote sensing data, and legacy maps). Remote sensing time-series (RST) data can provide additional useful information for soil mapping. The main aim of this study is therefore to compare three covariate sets for soil organic carbon (SOC) prediction in an arid agroecosystem in Iran: the MCC dataset, a monthly Sentinel-2 time-series of vegetation indices (RST), and the combination of both (MCC + RST). We used four machine learning algorithms: random forest (RF), Cubist, support vector machine (SVM), and partial least squares regression (PLSR). A total of 237 soil samples were collected from the 0-20 cm depth. Five-fold cross-validation was used to evaluate model performance, and 50 bootstrap models were fitted to quantify the prediction uncertainty. The Cubist model performed best with the MCC dataset (R² = 0.35, RMSE = 0.26%) and with the combined MCC and RST dataset (R² = 0.33, RMSE = 0.27%), while the RF model gave the best results for the RST dataset (R² = 0.10, RMSE = 0.31%). Soil properties explained most of the SOC variation in the MCC and combined datasets (66.35% and 50.82%, respectively), while NDVI was the dominant controlling factor in the RST dataset (50.22%). Accordingly, time-series vegetation indices alone did not have enough potential to increase SOC prediction accuracy. However, the combination of MCC and RST datasets produced SOC spatial maps with lower uncertainty. Future studies are therefore required to explain explicitly the efficiency of time-series remotely sensed data, and their interrelationship with environmental covariates, for predicting SOC in arid regions with low SOC content.
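
The evaluation protocol named in the abstract (5-fold cross-validation scored by R² and RMSE, plus 50 bootstrap models for prediction uncertainty) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the data are synthetic, the single covariate is hypothetical, and a one-variable least-squares line stands in for the RF, Cubist, SVM, and PLSR models used in the study. The numbers it prints are not the paper's results.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b).
    Stand-in model only; the study used RF, Cubist, SVM, and PLSR."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def rmse(obs, pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(statistics.fmean([(o - p) ** 2 for o, p in zip(obs, pred)]))


def r2(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mo = statistics.fmean(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mo) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def k_fold_cv(xs, ys, k=5, seed=0):
    """Shuffle indices, split into k folds, predict each held-out fold,
    and score the pooled out-of-fold predictions."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    obs, pred = [], []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        obs += [ys[i] for i in held_out]
        pred += [a + b * xs[i] for i in held_out]
    return r2(obs, pred), rmse(obs, pred)


def bootstrap_predictions(xs, ys, x_new, n_boot=50, seed=0):
    """Refit on n_boot resamples (with replacement) and return the mean and
    standard deviation of the predictions at x_new as an uncertainty measure."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_boot):
        sample = [rng.randrange(n) for _ in range(n)]
        a, b = fit_line([xs[i] for i in sample], [ys[i] for i in sample])
        preds.append(a + b * x_new)
    return statistics.fmean(preds), statistics.stdev(preds)


# Synthetic data (assumption): 237 samples, one hypothetical covariate.
rng = random.Random(42)
covariate = [rng.uniform(0.0, 1.0) for _ in range(237)]
soc = [0.5 + 0.6 * x + rng.gauss(0.0, 0.25) for x in covariate]

cv_r2, cv_rmse = k_fold_cv(covariate, soc, k=5)
mean_pred, sd_pred = bootstrap_predictions(covariate, soc, x_new=0.5, n_boot=50)
print(f"5-fold CV: R2 = {cv_r2:.2f}, RMSE = {cv_rmse:.2f}")
print(f"Bootstrap prediction at x = 0.5: {mean_pred:.2f} +/- {sd_pred:.3f}")
```

In a mapping workflow the bootstrap step would be repeated for every grid cell, so the per-cell spread of the 50 predictions forms the uncertainty map; the sketch evaluates a single location only.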
