4.7 Article

Improving pedotransfer functions for predicting soil mineral associated organic carbon by ensemble machine learning

Journal

GEODERMA
Volume 428

Publisher

ELSEVIER
DOI: 10.1016/j.geoderma.2022.116208

Keywords

Soil carbon fractions; Model ensemble; LUCAS Soil; Forward recursive feature selection

Funding

  1. National Natural Science Foundation of China
  2. Ten-thousand Talents Plan of Zhejiang Province, China
  3. Project of Department of Education Science and Technology of Jiangxi Province
  4. Social Science Foundation of Jiangxi Province
  5. LE STUDIUM Institute of Advanced Research Studies, France
  6. [41930754]
  7. [42001047]
  8. [41901055]
  9. [2019R52004]
  10. [GJJ210541]
  11. [21YJ43D]

This study evaluated the potential of machine learning methods in predicting mineral-associated organic carbon (MAOC) in soil. The results showed that machine learning-based predictive models can accurately predict MAOC, and the use of feature selection methods can optimize model performance and simplify model structure. Additionally, the study found that model ensemble methods can improve the accuracy and robustness of predictive models.
Soil organic carbon (SOC) sequestration is a promising natural climate solution for capturing atmospheric CO₂ that also provides crucial co-benefits for soil functions and services. Because SOC is not a single, uniform entity, a deep understanding of its response to environmental change requires additional information on SOC fractions with distinct characteristics, such as particulate organic carbon (POC) and mineral-associated organic carbon (MAOC). Despite their importance, POC and MAOC data are still scarce in soil databases, particularly at broad scales. Pedotransfer functions (PTFs) are a good strategy for estimating missing soil properties, but their application to SOC fractions has been poorly explored. Based on 352 representative mineral topsoil samples (0-20 cm) across Europe, we evaluated the potential of MAOC prediction using machine learning based PTFs (random forest (RF), Cubist, and gradient boosted machine (GBM)) together with predictor selection methods (recursive feature elimination (RFE) and forward recursive feature selection (FRFS)). Repeated validation (100 times) showed that MAOC could be well predicted by machine learning based PTFs (R² of 0.877-0.9, RMSE of 2.994-3.269 g kg⁻¹). RFE effectively reduced the number of predictors from 21 to 12, with performance comparable to that of models using all predictors. The proposed FRFS algorithm gave the most parsimonious model, with only 6 predictors (SOC, silt + clay, nitrogen, nitrogen deposition, soil erosion and sand), and performed similarly to or even better than RFE. In combination with FRFS, Cubist performed best among the three machine learning models (R² of 0.9, RMSE of 2.994 g kg⁻¹). Our results also showed that five model ensemble methods performed similarly to one another and improved model accuracy and robustness compared with a single machine learning model.
This study provides a valuable reference for coupling PTF and legacy soil databases to increase the spatial coverage and the performance of machine learning based SOC fraction predictions.
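
The idea of forward predictor selection mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's FRFS algorithm, whose details are not given here. It is a generic greedy forward selection, and it swaps in an ordinary least-squares fit for the RF, Cubist and GBM learners so that it runs with NumPy alone. The data are synthetic, and the function names, predictor names and the `min_gain` stopping threshold are all assumptions made for illustration.

```python
import numpy as np


def cv_rmse(X, y, n_splits=5, seed=0):
    """Mean k-fold cross-validated RMSE of an ordinary least-squares fit."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_splits)
    errors = []
    for i in range(n_splits):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(n_splits) if j != i])
        # Add an intercept column to the design matrices.
        A_train = np.column_stack([np.ones(len(train)), X[train]])
        A_test = np.column_stack([np.ones(len(test)), X[test]])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        errors.append(np.sqrt(np.mean((A_test @ coef - y[test]) ** 2)))
    return float(np.mean(errors))


def forward_select(X, y, names, min_gain=0.01):
    """Greedily add the predictor that lowers CV RMSE the most.

    Stops when the best candidate improves RMSE by less than
    min_gain (relative), an assumed stopping rule.
    """
    selected, remaining = [], list(range(X.shape[1]))
    best = float(np.std(y))  # rough baseline: always predict the mean
    while remaining:
        scores = {j: cv_rmse(X[:, selected + [j]], y) for j in remaining}
        j_best = min(scores, key=scores.get)
        if best - scores[j_best] < min_gain * best:
            break
        selected.append(j_best)
        remaining.remove(j_best)
        best = scores[j_best]
    return [names[j] for j in selected], best


# Synthetic stand-in data (hypothetical): only the first two columns matter.
rng = np.random.default_rng(42)
n = 352
names = ["soc", "silt_clay", "noise_1", "noise_2", "noise_3", "noise_4"]
X = rng.normal(size=(n, len(names)))
y = 3.0 * X[:, 0] + 1.5 * X[:, 1] + rng.normal(scale=0.5, size=n)

chosen, rmse = forward_select(X, y, names)
print(chosen, round(rmse, 3))
```

With a tree-based or rule-based learner in place of the linear fit, the same loop would evaluate each candidate predictor set by repeated validation, which is closer in spirit to the evaluation the abstract describes.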
