Article

PECC: Correcting contigs based on paired-end read distribution

Journal

COMPUTATIONAL BIOLOGY AND CHEMISTRY
Volume 69, Issue -, Pages 178-184

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.compbiolchem.2017.03.012

Keywords

Next generation sequencing; De novo assembly; Contigs; Paired-end reads

Funding

  1. National Science Fund [61622213]
  2. National Natural Science Foundation of China [61232001, 61420106009, 61379108, 61370172]

Motivation: Cheap and fast next-generation sequencing (NGS) technologies have greatly facilitated research on de novo assembly. Reliable contigs are critical for constructing reliable scaffolds. However, the contigs generated by most assemblers contain errors because of limitations in assembly strategy and computational complexity. Among these errors, misassembly is one of the most harmful types. Results: In this paper, we propose a new method, PECC, that identifies and corrects misassembly errors in contigs based on the paired-end read distribution. PECC extracts sequence regions with low paired-end read support and verifies them based on the distribution of paired-end support. To validate the effectiveness of PECC, we applied it to contigs produced by five popular assemblers on four real datasets, and we also carried out experiments to analyze the influence of PECC on scaffolding. The results show that PECC can reduce misassembly errors and improve scaffolding results, which demonstrates its promise for de novo assembly. (C) 2017 Elsevier Ltd. All rights reserved.
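
The abstract describes extracting contig regions with low paired-end read support. The following is a minimal Python sketch of that general idea only, not PECC's actual procedure, which the abstract does not specify. It assumes each read pair has already been mapped to the contig and reduced to a half-open fragment interval (start, end) covering both mates and the insert between them. The function names, the `min_fraction` threshold, and the `margin` parameter are hypothetical choices made for illustration.

```python
from typing import List, Tuple


def spanning_support(contig_len: int, fragments: List[Tuple[int, int]]) -> List[int]:
    """Count, for each contig position, how many fragments span it.

    Each fragment is a half-open interval (start, end) on the contig.
    """
    diff = [0] * (contig_len + 1)  # difference array: +1 at start, -1 at end
    for start, end in fragments:
        start = max(start, 0)
        end = min(end, contig_len)
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    support = []
    running = 0
    for delta in diff[:contig_len]:
        running += delta
        support.append(running)
    return support


def low_support_regions(
    support: List[int], min_fraction: float = 0.2, margin: int = 0
) -> List[Tuple[int, int]]:
    """Return half-open intervals whose support is below min_fraction * mean.

    Assumption: `margin` positions at each contig end are skipped, because
    support drops there naturally and does not indicate a misassembly.
    """
    lo, hi = margin, len(support) - margin
    if hi <= lo:
        return []
    interior = support[lo:hi]
    threshold = min_fraction * (sum(interior) / len(interior))
    regions = []
    region_start = None
    for pos in range(lo, hi):
        if support[pos] < threshold:
            if region_start is None:
                region_start = pos
        elif region_start is not None:
            regions.append((region_start, pos))
            region_start = None
    if region_start is not None:
        regions.append((region_start, hi))
    return regions


# Toy contig of length 20: no fragment spans positions 8..11.
toy_fragments = [(0, 8), (1, 8), (12, 20), (12, 19)]
toy_support = spanning_support(20, toy_fragments)
candidates = low_support_regions(toy_support)
```

On the toy input, `candidates` contains the single interval (8, 12), the stretch no fragment spans. A region flagged this way is only a candidate: the abstract states that PECC goes on to verify such regions against the distribution of paired-end support, a step this sketch does not attempt.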
