4.6 Article

Mind the Gap: Upgrading Genomes with Pacific Biosciences RS Long-Read Sequencing Technology

期刊

PLOS ONE
卷 7, 期 11, 页码 -

出版社

PUBLIC LIBRARY SCIENCE
DOI: 10.1371/journal.pone.0047768

关键词

-

资金

  1. NHGRI [NHGRI U54 HG003273]
  2. NCRR [1S10RR026605-01]

向作者/读者索取更多资源

Many genomes have been sequenced to high-quality draft status using Sanger capillary electrophoresis and/or newer short-read sequence data and whole genome assembly techniques. However, even the best draft genomes contain gaps and other imperfections due to limitations in the input data and the techniques used to build draft assemblies. Sequencing biases, repetitive genomic features, genomic polymorphism, and other complicating factors all come together to make some regions difficult or impossible to assemble. Traditionally, draft genomes were upgraded to phase 3 finished status using time-consuming and expensive Sanger-based manual finishing processes. For more facile assembly and automated finishing of draft genomes, we present here an automated approach to finishing using long-reads from the Pacific Biosciences RS (PacBio) platform. Our algorithm and associated software tool, PBJelly, (publicly available at https://sourceforge.net/projects/pb-jelly/) automates the finishing process using long sequence reads in a reference-guided assembly process. PBJelly also provides lift-over co-ordinate tables to easily port existing annotations to the upgraded assembly. Using PBJelly and long PacBio reads, we upgraded the draft genome sequences of a simulated Drosophila melanogaster, the version 2 draft Drosophila pseudoobscura, an assembly of the Assemblathon 2.0 budgerigar dataset, and a preliminary assembly of the Sooty mangabey. With 24xmapped coverage of PacBio long-reads, we addressed 99% of gaps and were able to close 69% and improve 12% of all gaps in D. pseudoobscura. With 4xmapped coverage of PacBio long-reads we saw reads address 63% of gaps in our budgerigar assembly, of which 32% were closed and 63% improved. With 6.8xmapped coverage of mangabey PacBio long-reads we addressed 97% of gaps and closed 66% of addressed gaps and improved 19%. The accuracy of gap closure was validated by comparison to Sanger sequencing on gaps from the original D. pseudoobscura draft assembly and shown to be dependent on initial reference quality.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.6
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据