4.7 Article

Identifying causes of crop yield variability with interpretive machine learning

Journal

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.compag.2021.106632

Keywords

Yield modelling; Soil constraints; Precision agriculture; Digital agriculture; Digital soil mapping

Funding

  1. Cubbie Agriculture
  2. GRDC


Machine learning methods have been widely used for crop yield modeling and forecasting, but there has been limited application in understanding site-specific yield constraints. More quantitative and systematic approaches are needed to identify and understand the causes of variation in crop yield in order to implement appropriate management responses.
Machine learning approaches have been widely used for crop yield modelling and yield forecasting, but there has been limited application to understanding site-specific yield constraints. Crop yield is driven by a complex interaction of spatial and temporal variables, which makes it challenging to define the exact cause of observed spatial yield variability explicitly. This makes it difficult to design efficient management strategies to address production constraints. There is a need for a more quantitative and systematic approach to identify and understand the causes of variation in crop yield in order to implement appropriate management responses. This study investigated the use of interpretive machine learning (IML) to address this need. The developed methodology was demonstrated on furrow-irrigated cotton fields totalling approximately 2000 ha in the Condamine-Balonne River catchment, Australia. Digital soil maps of important soil constraints were created at 20 m spatial resolution using 70 soil cores extracted to 1.4 m depth and a combination of on-farm and off-farm spatial data layers. Specifically, the soil constraints represented were exchangeable sodium percentage (ESP; sodicity), pH (alkalinity), and electrical conductivity (ECe; salinity). Terrain infrastructure variable maps of closed depressions, distance down furrow, and cut and fill (from landforming practices) were also developed. Empirical models of cotton lint yield were created with gradient boosted decision trees (XGBoost) using the digital soil maps and terrain infrastructure data as predictor variables. The models could describe the spatial variation in yield well, with a median Lin's concordance correlation coefficient of 0.67 and root-mean-square error of 0.75 b ha⁻¹. SHapley Additive exPlanations (SHAP), an IML approach based on game theory, was then used to identify the contribution of each variable to the modelled yield across the study area.
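The two accuracy metrics quoted above can be illustrated with a minimal sketch. It assumes the usual population-moment form of Lin's concordance correlation coefficient, CCC = 2·cov(o, p) / (var(o) + var(p) + (mean(o) − mean(p))²); the function names `lins_ccc` and `rmse` and the sample numbers are hypothetical and are not taken from the study.

```python
import math


def lins_ccc(observed, predicted):
    """Lin's concordance correlation coefficient using population (1/n) moments."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    # The squared mean difference penalises systematic bias,
    # which plain Pearson correlation would ignore.
    return 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)


def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the inputs."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Made-up yields for illustration only: predictions offset by a constant 1.0.
observed_yield = [1.0, 2.0, 3.0]
predicted_yield = [2.0, 3.0, 4.0]
ccc_value = lins_ccc(observed_yield, predicted_yield)
rmse_value = rmse(observed_yield, predicted_yield)
```

In this toy case the predictions are perfectly correlated with the observations, yet the constant offset lowers the CCC below 1, which is why the CCC is often preferred over correlation alone when judging agreement with measured yield.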
The variable most decreasing yield at each point was identified and mapped across the study area, and the spatial extent represented by each variable was quantified. The SHAP values for each predictor variable were also extracted and mapped for a case study field, which demonstrated the magnitude of the impact of each variable on yield with spatial context in easily interpretable units (b ha⁻¹). The presented methodology is promising for cost-benefit analysis of implementing remediation strategies or, where remediation is not economically feasible, for altering management inputs according to a constrained yield potential.
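The mapping step described above can be sketched in a few lines, assuming the SHAP values are already available as one row per grid cell and one column per predictor. The predictor labels, the helper names `most_limiting` and `area_share`, the rule that a cell with no negative SHAP value has no limiting variable, and all numbers below are assumptions for illustration, not the authors' implementation or data.

```python
from collections import Counter

# Hypothetical column labels matching the predictors named in the abstract.
PREDICTORS = ["ESP", "pH", "ECe", "closed_depressions",
              "distance_down_furrow", "cut_and_fill"]


def most_limiting(shap_row, names=PREDICTORS):
    """Return the predictor with the most negative SHAP value in one grid cell.

    Returns None when no predictor lowers the modelled yield (assumed rule).
    """
    idx = min(range(len(shap_row)), key=lambda i: shap_row[i])
    if shap_row[idx] >= 0:
        return None
    return names[idx]


def area_share(shap_rows, names=PREDICTORS):
    """Fraction of grid cells for which each predictor is the most limiting.

    Assumes equal-area cells, as on a regular raster grid.
    """
    counts = Counter(most_limiting(row, names) for row in shap_rows)
    total = len(shap_rows)
    return {name: count / total for name, count in counts.items()}


# Made-up SHAP values (yield units) for four grid cells, illustration only.
example_shap = [
    [-0.8, -0.1, 0.2, 0.0, -0.3, 0.1],
    [0.1, -0.2, -0.9, 0.0, 0.3, 0.0],
    [-0.5, 0.2, 0.1, -0.1, 0.0, 0.0],
    [0.3, 0.1, 0.2, 0.0, 0.1, 0.4],
]
shares = area_share(example_shap)
```

Because SHAP values are expressed in the units of the model output, the minimum in each row can be read directly as the largest modelled yield penalty at that location.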
