4.7 Article

Illumina Synthetic Long Read Sequencing Allows Recovery of Missing Sequences even in the Finished C. elegans Genome

期刊

SCIENTIFIC REPORTS
卷 5, 期 -, 页码 -

出版社

NATURE PUBLISHING GROUP
DOI: 10.1038/srep10814

关键词

-

资金

  1. Early Career Scheme (ECS) Fund [HKBU263512]
  2. Collaborative Research Fund (CRF) of Hong Kong Research Grants Council (RGC) [HKBU5/CRF/11G]

向作者/读者索取更多资源

Most next-generation sequencing platforms permit acquisition of high-throughput DNA sequences, but the relatively short read length limits their use in genome assembly or finishing. Illumina has recently released a technology called Synthetic Long-Read Sequencing that can produce reads of unusual length, i.e., predominately around 10 Kb. However, a systematic assessment of their use in genome finishing and assembly is still lacking. We evaluate the promise and deficiency of the long reads in these aspects using isogenic C. elegans genome with no gap. First, the reads are highly accurate and capable of recovering most types of repetitive sequences. However, the presence of tandem repetitive sequences prevents pre-assembly of long reads in the relevant genomic region. Second, the reads are able to reliably detect missing but not extra sequences in the C. elegans genome. Third, the reads of smaller size are more capable of recovering repetitive sequences than those of bigger size. Fourth, at least 40 Kbp missing genomic sequences are recovered in the C. elegans genome using the long reads. Finally, an N50 contig size of at least 86 Kbp can be achieved with 24 x reads but with substantial mis-assembly errors, highlighting a need for novel assembly algorithm for the long reads.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.7
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据