Article

Predicting phenotypes from genetic, environment, management, and historical data using CNNs

Journal

THEORETICAL AND APPLIED GENETICS
Volume 134, Issue 12, Pages 3997-4011

Publisher

SPRINGER
DOI: 10.1007/s00122-021-03943-7

Funding

  1. United States National Science Foundation (NSF) Postdoctoral Research Fellowship in Biology [1710618]
  2. US Department of Agriculture, Agricultural Research Service
  3. Division of Integrative Organismal Systems
  4. Directorate for Biological Sciences [1710618], funding source: National Science Foundation

Convolutional Neural Networks (CNNs) can match or outperform standard genomic prediction methods in predicting agronomic yield when sufficient genetic, environmental, and management data are provided. They allow the data itself to determine the important factors, and they were more accurate than standard methods when genetic, environmental, and management factors were all held out together.

Key Message: Convolutional Neural Networks (CNNs) can perform similarly to or better than standard genomic prediction methods when sufficient genetic, environmental, and management data are provided.

Predicting phenotypes from genetic (G), environmental (E), and management (M) conditions is a long-standing challenge with implications for agriculture, medicine, and conservation. Most methods reduce the factors in a dataset (feature engineering) in a subjective and potentially oversimplified manner. Deep neural networks such as Multilayer Perceptrons (MLPs) and Convolutional Neural Networks (CNNs) can overcome this by allowing the data itself to determine which factors are most important. CNN models were developed for predicting agronomic yield from a combination of replicated trials and historical yield survey data. The CNN predictions were more accurate than those of standard methods when tested on held-out G, E, and M data (r = 0.50 vs. r = 0.43), and slightly less accurate when only G was held out (r = 0.74 vs. r = 0.80). Pre-training on historical data increased accuracy compared to training on trial data alone. Saliency map analysis indicated that the CNN learned to prioritize many factors of known agricultural importance.
