4.3 Article

QTG-Finder2: A Generalized Machine-Learning Algorithm for Prioritizing QTL Causal Genes in Plants

Journal

G3-GENES GENOMES GENETICS
Volume 10, Issue 7, Pages 2411-2421

Publisher

OXFORD UNIV PRESS INC
DOI: 10.1534/g3.120.401122

Keywords

Sorghum bicolor; Setaria viridis; machine learning; quantitative trait loci; causal genes

Funding

  1. U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research, Genomic Science Program [DE-SC0018277, DE-SC0008769]

Ask authors/readers for more resources

Linkage mapping has been widely used to identify quantitative trait loci (QTL) in many plants and usually requires a time-consuming and labor-intensive fine mapping process to find the causal gene underlying the QTL. Previously, we described QTG-Finder, a machine-learning algorithm to rationally prioritize candidate causal genes in QTLs. While it showed good performance, QTG-Finder could only be used in Arabidopsis and rice because of the limited number of known causal genes in other species. Here we tested the feasibility of enabling QTG-Finder to work on species that have few or no known causal genes by using orthologs of known causal genes as the training set. The model trained with orthologs could recall about 64% of Arabidopsis and 83% of rice causal genes when the top 20% ranked genes were considered, which is similar to the performance of models trained with known causal genes. The average precision was 0.027 for Arabidopsis and 0.029 for rice. We further extended the algorithm to include polymorphisms in conserved non-coding sequences and gene presence/absence variation as additional features. Using this algorithm, QTG-Finder2, we trained and cross-validatedSorghum bicolorandSetaria viridismodels. TheS. bicolormodel was validated by causal genes curated from the literature and could recall 70% of causal genes when the top 20% ranked genes were considered. In addition, we applied theS. viridismodel and public transcriptome data to prioritize a plant height QTL and identified 13 candidate genes. QTL-Finder2 can accelerate the discovery of causal genes in any plant species and facilitate agricultural trait improvement.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.3
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available