4.7 Article

Bioinformatic Extraction of Functional Genetic Diversity from Heterogeneous Germplasm Collections for Crop Improvement

期刊

AGRONOMY-BASEL
卷 10, 期 4, 页码 -

出版社

MDPI
DOI: 10.3390/agronomy10040593

关键词

ex situ conservation; core collection; Gene Ontology; genome wide association; machine learning; natural language processing; SNP

资金

  1. United States Department of Agriculture, Agricultural Research Service [3012-21000-015-00-D]

向作者/读者索取更多资源

Efficient utilization of genetic variation in plant germplasm collections is impeded by large collection size, uneven characterization of traits, and unpredictable apportionment of allelic diversity among heterogeneous accessions. Distributing compact subsets of the complete collection that contain maximum allelic diversity at functional loci of interest could streamline conventional and precision breeding. Using heterogeneous population samples from Arabidopsis, Populus and sorghum, we show that genomewide single nucleotide polymorphism (SNP) data permits the capture of 3-78 fold more haplotypic diversity in subsets than geographic or environmental data, which are commonly used surrogate predictors of genetic diversity. Using a large genomewide SNP data set from landrace sorghum, we demonstrate three bioinformatic approaches to extract functional genetic diversity. First, in a candidate gene approach, we assembled subsets that maximized haplotypic diversity at 135 putative lignin biosynthetic loci, relevant to biomass breeding programs. Secondly, we applied a keyword search against the Gene Ontology to identify 1040 regulatory loci and assembled subsets capturing genomewide regulatory gene diversity, a general source of phenotypic variation. Third, we developed a machine-learning approach to rank semantic similarity between Gene Ontology term definitions and the textual content of scientific publications on crop adaptation to climate, a complex breeding objective. We identified 505 sorghum loci whose defined function is semantically-related to climate adaptation concepts. The assembled subsets could be used to address climatic pressures on sorghum production. To face impending agricultural challenges and foster rapid extraction and use of novel genetic diversity resident in heterogeneous germplasm collections, whole genome resequencing efforts should be prioritized.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.7
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据