Article

Separating homeologs by phasing in the tetraploid wheat transcriptome

Journal

GENOME BIOLOGY
Volume 14, Issue 6

Publisher

BMC
DOI: 10.1186/gb-2013-14-6-r66

Keywords

Transcriptome assembly; multiple k-mer assembly; wheat; polyploid; Triticum urartu; Triticum turgidum; pseudogenes; phasing; gene prediction

Funding

  1. Howard Hughes Medical Institute
  2. Gordon and Betty Moore Foundation [GBMF3031]
  3. National Research Initiative from the USDA National Institute of Food and Agriculture [2011-68002-30029, 2011-67013-30077]
  4. Biotechnology and Biological Sciences Research Council (BBSRC) [BB/J003557/1]
  5. USDA NIFA [2012-67012-19811]
  6. BBSRC [BBS/E/T/000PR6193, BB/J003557/1, BB/I000712/1, BB/J003743/1, BBS/E/T/000PR5885, BBS/E/J/000C0628] Funding Source: UKRI
  7. Biotechnology and Biological Sciences Research Council [BBS/E/J/000C0628, BB/J003743/1, BB/J003557/1, BBS/E/T/000PR6193, BB/I000712/1, BBS/E/T/000PR5885] Funding Source: researchfish

Abstract

Background: The high level of identity among duplicated homoeologous genomes in tetraploid pasta wheat presents substantial challenges for de novo transcriptome assembly. To solve this problem, we develop a specialized bioinformatics workflow that optimizes transcriptome assembly and separation of merged homoeologs. To evaluate our strategy, we sequence and assemble the transcriptome of one of the diploid ancestors of pasta wheat, and compare both assemblies with a benchmark set of 13,472 full-length, non-redundant bread wheat cDNAs.

Results: A total of 489 million 100 bp paired-end reads from tetraploid wheat assemble into 140,118 contigs, which include 96% of the benchmark cDNAs. We used a comparative genomics approach to annotate 66,633 open reading frames. The multiple k-mer assembly strategy increases the proportion of cDNAs assembled full-length in a single contig by 22% relative to the best single k-mer size. Homoeologs are separated using a post-assembly pipeline that includes polymorphism identification, phasing of SNPs, read sorting, and re-assembly of phased reads. Using a reference set of genes, we determine that 98.7% of the SNPs analyzed are correctly separated by phasing.

Conclusions: Our study shows that de novo transcriptome assembly of tetraploid wheat benefits from multiple k-mer assembly strategies more than that of diploid wheat. Our results also demonstrate that phasing approaches originally designed for heterozygous diploid organisms can be used to separate the close homoeologous genomes of tetraploid wheat. The predicted tetraploid wheat proteome and gene models provide a valuable tool for the wheat research community and for those interested in comparative genomic studies.
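The read-sorting step named in the abstract (assigning reads to one homoeolog or the other once SNPs have been phased) can be illustrated with a toy sketch. This is not the authors' pipeline: the function name `sort_reads_by_phase`, the input layout, and the simple majority-vote rule are all assumptions made for illustration, and real data would come from alignment and variant files rather than in-memory tuples.

```python
from collections import defaultdict


def sort_reads_by_phase(reads, phase_a, phase_b):
    """Bin reads into 'A', 'B', or 'unassigned' (hypothetical helper).

    reads: list of (name, start, sequence); start is 0-based on the contig.
    phase_a, phase_b: dict mapping contig position -> phased allele.
    """
    bins = defaultdict(list)
    for name, start, seq in reads:
        score_a = score_b = 0
        for offset, base in enumerate(seq):
            pos = start + offset
            # Count agreements with each phase at phased SNP positions only.
            if phase_a.get(pos) == base:
                score_a += 1
            if phase_b.get(pos) == base:
                score_b += 1
        if score_a > score_b:
            bins["A"].append(name)
        elif score_b > score_a:
            bins["B"].append(name)
        else:
            # No phased SNP covered, or a tie: leave the read unassigned.
            bins["unassigned"].append(name)
    return dict(bins)


# Two phased SNPs on a made-up contig (positions 2 and 6).
phase_a = {2: "A", 6: "C"}
phase_b = {2: "G", 6: "T"}
reads = [
    ("r1", 0, "TTACGGC"),  # carries A at 2 and C at 6
    ("r2", 1, "TGCGGT"),   # carries G at 2 and T at 6
    ("r3", 3, "CGG"),      # covers no phased SNP
]
bins = sort_reads_by_phase(reads, phase_a, phase_b)
print(bins)
```

In this sketch, the reads in each bin would then be re-assembled separately, which corresponds to the final step the abstract lists. Reads that cover no phased SNP are kept apart here; how such reads are actually handled is not stated in the abstract.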
