4.7 Article

Anatomy of a hash-based long read sequence mapping algorithm for next generation DNA sequencing

Journal

BIOINFORMATICS
Volume 27, Issue 2, Pages 189-195

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btq648

Funding

  1. National Science Foundation [CCF-0621443, SDCI OCI-0724599, CNS-0551639, IIS-0536994, HECURA-0938000]
  2. Department of Energy [DE-FC02-07ER25808, DE-FG02-08ER25848]
  3. Direct For Computer & Info Scie & Enginr
  4. Division of Computing and Communication Foundations [0938000] Funding Source: National Science Foundation

Motivation: Recently, a number of programs have been proposed for mapping short reads to a reference genome. Many of them are heavily optimized for short-read mapping and hence are very efficient for shorter queries, but that makes them inefficient or not applicable for reads longer than 200 bp. However, many sequencers are already generating longer reads and more are expected to follow. For long read sequence mapping, there are limited options; BLAT, SSAHA2, FANGS and BWA-SW are among the popular ones. However, resequencing and personalized medicine need much faster software to map these long sequencing reads to a reference genome to identify SNPs or rare transcripts.

Results: We present AGILE (AliGnIng Long rEads), a hash table based high-throughput sequence mapping algorithm for longer 454 reads that uses diagonal multiple seed-match criteria, customized q-gram filtering and a dynamic incremental search approach among other heuristics to optimize every step of the mapping process. In our experiments, we observe that AGILE is more accurate than BLAT, and comparable to BWA-SW and SSAHA2. For practical error rates (< 5%) and read lengths (200-1000 bp), AGILE is significantly faster than BLAT, SSAHA2 and BWA-SW. Even for the other cases, AGILE is comparable to BWA-SW and several times faster than BLAT and SSAHA2.
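As a rough illustration of the general idea behind hash-based seeding with a multiple-seed-per-diagonal criterion, the sketch below indexes the k-mers of a reference in a hash table and counts, for each diagonal (reference position minus read position), how many seeds from a read land on it. This is not AGILE's implementation: the function names `build_index` and `candidate_diagonals`, the parameters `k` and `min_seeds`, and the exact-match seeds are all assumptions made for the example, and the q-gram filtering and incremental search mentioned in the abstract are not modelled.

```python
from collections import defaultdict


def build_index(reference, k):
    """Hash every k-mer of the reference to the list of its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def candidate_diagonals(read, index, k, min_seeds):
    """Return (diagonal, seed_count) pairs with at least min_seeds seed hits.

    A diagonal is ref_pos - read_pos; seeds from one true alignment without
    indels share the same diagonal, so requiring several hits on a diagonal
    discards most spurious single-seed matches. (Illustrative assumption,
    not the criterion used by any specific tool.)
    """
    votes = defaultdict(int)
    for read_pos in range(len(read) - k + 1):
        seed = read[read_pos:read_pos + k]
        # .get avoids inserting empty lists for k-mers absent from the index
        for ref_pos in index.get(seed, ()):
            votes[ref_pos - read_pos] += 1
    hits = [(d, n) for d, n in votes.items() if n >= min_seeds]
    # Most-supported diagonals first
    return sorted(hits, key=lambda item: -item[1])


# Hypothetical toy data: the read is an exact substring of the reference.
reference = "ACGTTGCAAGCTTAGGCTAACGTAGCTAGGATCC"
read = reference[10:26]
index = build_index(reference, k=4)
candidates = candidate_diagonals(read, index, k=4, min_seeds=3)
```

In a fuller mapper, each surviving diagonal would then be passed to a verification step such as a banded alignment; that stage is omitted here.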
