4.7 Article

Evidence-based gene predictions in plant genomes

期刊

GENOME RESEARCH
卷 19, 期 10, 页码 1912-1923

出版社

COLD SPRING HARBOR LAB PRESS, PUBLICATIONS DEPT
DOI: 10.1101/gr.088997.108

关键词

-

资金

  1. National Science Foundation [0321685, 0703908]
  2. U. S. Department of Agriculture-Agricultural Research Service [58-1907-0-041]
  3. The Donald Danforth Plant Science Center, St. Louis, MO
  4. Cornell University, Ithaca, NY
  5. Direct For Biological Sciences
  6. Division Of Integrative Organismal Systems [0703908, 0321685] Funding Source: National Science Foundation

向作者/读者索取更多资源

Automated evidence-based gene building is a rapid and cost-effective way to provide reliable gene annotations on newly sequenced genomes. One of the limitations of evidence-based gene builders, however, is their requirement for transcriptional evidence-known proteins, full-length cDNAs, or expressed sequence tags (ESTs)-in the species of interest. This limitation is of particular concern for plant genomes, where the rate of genome sequencing is greatly outpacing the rate of EST-and cDNA-sequencing projects. To overcome this limitation, we have developed an evidence-based gene build system (the Gramene pipeline) that can use transcriptional evidence across related species. The Gramene pipeline uses the Ensembl computing infrastructure with a novel data processing scheme. Using the previously annotated plant genomes, the dicot Arabidopsis thaliana and the monocot Oryza sativa, we show that the cross-species ESTs from within monocot or dicot class are a valuable source of evidence for gene predictions. We also find that, using only EST and cross-species evidence, the Gramene pipeline can generate a plant gene set that is comparable in quality to the human genes based on known proteins and full-length cDNAs. We compare the Gramene pipeline to several widely used ab initio gene prediction programs in rice; this comparison shows the pipeline performs favorably at both the gene and exon levels with cross-species gene products only. We discuss the results of testing the pipeline on a 22-Mb region of the newly sequenced maize genome and discuss potential application of the pipeline to other genomes.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.7
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据