Article

County-scale crop yield prediction by integrating crop simulation with machine learning models

Journal

FRONTIERS IN PLANT SCIENCE
Volume 13

Publisher

FRONTIERS MEDIA SA
DOI: 10.3389/fpls.2022.1000224

Keywords

data integration; APSIM; ensemble model; spatial analysis; model transparency

Funding

  1. National Science Foundation [1830478]
  2. Plant Sciences Institute's Faculty Scholars program at Iowa State University
  3. National Science Foundation, Directorate for Engineering, Division of Civil, Mechanical, and Manufacturing Innovation [1830478]

Crop yield prediction is a complex task due to interactions among various factors and uncertainty in input values. This study combines crop modeling with machine learning to predict maize yields in the US Corn Belt. The results show that the accuracy of prediction improves when crop modeling is coupled with machine learning. The ensemble model outperforms individual ML models, and the analysis reveals that low cropland ratios, soil input data, and extreme weather events contribute to high prediction errors.
Crop yield prediction is of great importance for decision making, yet it remains an ongoing scientific challenge. Interactions among genetic, environmental, and management factors, together with uncertainty in input values, make crop yield prediction complex. Building on previous work in which we coupled crop modeling with machine learning (ML) models to predict maize yields for three US Corn Belt states, here we expand the concept to the entire US Corn Belt (12 states). More specifically, we built five new ML models and their ensemble models, considering scenarios with and without crop modeling variables. Additional inputs to our models are soil, weather, management, and historical yield data. A unique aspect of our work is the spatial analysis used to investigate the causes of low or high model prediction errors. Our results indicate that prediction accuracy increases when crop modeling is coupled with machine learning. The ensemble model outperformed the individual ML models, with a relative root mean square error (RRMSE) of about 9% for the test years (2018, 2019, and 2020), which is comparable to previous studies. In addition, analysis of the sources of error revealed that counties and crop reporting districts with low cropland ratios have high RRMSE. Furthermore, we found that soil input data and extreme weather events were responsible for high errors in some regions. The proposed models can be deployed for large-scale prediction at the county level and, contingent upon data availability, can be utilized for field-level prediction.
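
To make the RRMSE metric concrete, here is a minimal sketch in Python. The abstract does not state the exact formula or how the ensemble is built, so two assumptions are made: RRMSE is taken as RMSE divided by the mean observed yield, expressed as a percentage, and the ensemble is an unweighted average of the individual model predictions. The function names `rrmse` and `average_ensemble` are hypothetical and do not come from the paper.

```python
import math


def rrmse(observed, predicted):
    """Relative RMSE in percent (assumed definition: RMSE / mean observed * 100)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all county-year pairs
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    mean_obs = sum(observed) / len(observed)
    return 100.0 * math.sqrt(mse) / mean_obs


def average_ensemble(predictions_by_model):
    """Unweighted mean across models (an assumption, not the paper's method).

    predictions_by_model: list of equal-length prediction lists, one per model.
    """
    n_models = len(predictions_by_model)
    return [sum(vals) / n_models for vals in zip(*predictions_by_model)]
```

For example, with observed yields `[10.0, 10.0]` and predictions `[9.0, 11.0]`, the RMSE is 1.0 and the mean observed yield is 10.0, so `rrmse` returns 10.0. Averaging two models whose errors have opposite signs can cancel those errors, which is one intuition for why an ensemble may beat its members.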
