Article

Multivariate random forest for digital soil mapping

Journal

GEODERMA
Volume 431

Publisher

ELSEVIER
DOI: 10.1016/j.geoderma.2023.116365

Keywords

Digital soil mapping; Random forest; Stochastic simulation; Regression co-kriging; Soil organic carbon

In digital soil mapping, traditional univariate methods neglect the dependence structure between soil properties, whereas multivariate machine learning models can capture complex non-linear relationships while maintaining that structure. This study compares a multivariate random forest model with two separate univariate random forest models and finds that the multivariate model better maintains the dependence structure and produces more realistic results.
In digital soil mapping (DSM), soil maps are usually produced in a univariate manner: each map is produced independently, so when multiple soil properties are mapped, the underlying dependence structure between them is ignored. This may lead to inconsistent predictions and simulations. For example, soil organic carbon (SOC) and total nitrogen (TN) maps produced independently may show unrealistic carbon-nitrogen (C:N) ratios. In the last decade, producing soil maps with machine learning models has become increasingly popular because these models can capture complex non-linear relationships between soil properties and environmental covariates. However, multivariate machine learning models have rarely been used to produce soil maps, and their use in DSM requires further investigation. In this paper we present the combined modelling of multiple soil properties with a multivariate random forest (MRF) model. We applied this model to mapping SOC and TN and compared it with two separate univariate random forest (RF) models. The comparison was made by means of stochastic simulations, obtained by sampling from the conditional distributions of the soil properties given the covariates, as estimated by quantile regression forest. The results show that the MRF model is superior in maintaining the dependence structure between SOC and TN and consequently produces more realistic C:N ratios. The models were also compared on prediction accuracy using common metrics such as the root mean square error (RMSE). The accuracy of the MRF model (RMSE-SOC = 40.04, RMSE-TN = 2.26, RMSE-CN = 3.58) is comparable to that of the univariate RF models (RMSE-SOC = 39.76, RMSE-TN = 2.26, RMSE-CN = 3.65). We performed the same comparisons between a regression co-kriging model and two separate regression kriging models and reached similar conclusions.
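
The effect of ignoring the dependence structure can be illustrated with a small toy simulation. The sketch below is not the MRF or quantile regression forest procedure from the paper. It assumes a hypothetical bivariate lognormal distribution for SOC and TN (the means, spreads, and correlation are invented for illustration) and compares draws that keep SOC and TN paired with draws in which TN is simulated independently of SOC. The helper names `rmse`, `simulate_pairs`, and `summarize` are likewise assumptions.

```python
import numpy as np


def rmse(observed, predicted):
    """Root mean square error between two equally sized sequences."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))


def simulate_pairs(n, rng, joint=True):
    """Draw n synthetic (SOC, TN) pairs from a toy lognormal model.

    All parameter values below are hypothetical, not taken from the paper.
    """
    log_mean = np.array([np.log(30.0), np.log(2.5)])
    log_sd = np.array([0.5, 0.5])
    rho = 0.9  # assumed correlation between log(SOC) and log(TN)
    off_diag = rho * log_sd[0] * log_sd[1]
    cov = np.array([[log_sd[0] ** 2, off_diag],
                    [off_diag, log_sd[1] ** 2]])
    log_draws = rng.multivariate_normal(log_mean, cov, size=n)
    if not joint:
        # Mimic two independent univariate simulations: the marginal
        # distribution of TN is unchanged, but its pairing with SOC is lost.
        log_draws[:, 1] = rng.permutation(log_draws[:, 1])
    return np.exp(log_draws)


def summarize(pairs):
    """Correlation of SOC and TN and the 5%-95% range of the C:N ratio."""
    soc, tn = pairs[:, 0], pairs[:, 1]
    cn = soc / tn
    p05, p95 = np.percentile(cn, [5, 95])
    return {
        "corr": float(np.corrcoef(soc, tn)[0, 1]),
        "cn_p05": float(p05),
        "cn_p95": float(p95),
    }


rng = np.random.default_rng(0)
joint_stats = summarize(simulate_pairs(5000, rng, joint=True))
indep_stats = summarize(simulate_pairs(5000, rng, joint=False))

print("joint      :", joint_stats)
print("independent:", indep_stats)
```

Under these assumptions the independent draws lose the SOC-TN correlation and yield a much wider range of C:N ratios, even though each marginal distribution is unchanged. This is why accuracy metrics computed per property, such as RMSE, can be similar for both approaches while the simulated ratios differ.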
