Article

Suitability of Different Mapping Algorithms for Genome-Wide Polymorphism Scans with Pool-Seq Data

Journal

G3-GENES GENOMES GENETICS
Volume 6, Issue 11, Pages 3507-3515

Publisher

OXFORD UNIV PRESS INC
DOI: 10.1534/g3.116.034488

Keywords

Pool-Seq; bioinformatics; Next Generation Sequencing; mapping algorithm; Drosophila

Funding

  1. European Research Council
  2. Austrian Science Funds [FWF-W1225]

The cost-effectiveness of sequencing pools of individuals (Pool-Seq) underlies the popularity and widespread use of this method for many research questions, ranging from unraveling the genetic basis of complex traits to the clonal evolution of cancer cells. Because many potential sources of error can affect the accuracy of Pool-Seq, several studies have assessed the influence of factors such as the sequencing technology, the library preparation protocol, and the mapping parameters. The impact of the mapping tools themselves, however, has not yet been evaluated. Using simulated and real Pool-Seq data, we demonstrate a substantial impact of the mapping tools, leading to characteristic false positives in genome-wide scans. The problem of false positives was particularly pronounced when data with different read lengths and insert sizes were compared. Out of 14 evaluated algorithms, novoalign, bwa mem, and clc4 are the most suitable for mapping Pool-Seq data. Nevertheless, no single algorithm is sufficient to avoid all false positives. We show that intersecting the results of two mapping algorithms provides a simple yet effective strategy to eliminate false positives. We propose that implementing a consistent Pool-Seq bioinformatics pipeline that builds on the recommendations of this study can substantially increase the reliability of Pool-Seq results, in particular when libraries generated with different protocols are compared.
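The intersection strategy described in the abstract can be illustrated with a minimal sketch. It assumes that each mapper's downstream scan has already produced a list of candidate sites as (chromosome, position) pairs. The function name `intersect_candidates` and the toy coordinates are hypothetical and are not taken from the paper or its pipeline.

```python
from typing import Iterable, Set, Tuple

# A candidate site is identified by (chromosome, 1-based position).
Site = Tuple[str, int]


def intersect_candidates(mapper_a: Iterable[Site],
                         mapper_b: Iterable[Site]) -> Set[Site]:
    """Keep only the sites flagged in the scans of both mappers."""
    return set(mapper_a) & set(mapper_b)


# Hypothetical toy data: candidate sites from scans based on two mappers.
novoalign_hits = [("2L", 1050), ("2L", 20431), ("3R", 777)]
bwa_mem_hits = [("2L", 20431), ("3R", 777), ("X", 5120)]

# Sites supported by only one mapper are treated as likely mapper-specific
# artifacts and dropped (an assumption of this sketch).
consistent = sorted(intersect_candidates(novoalign_hits, bwa_mem_hits))
print(consistent)  # [('2L', 20431), ('3R', 777)]
```

Sites reported by only one of the two mappers, here `("2L", 1050)` and `("X", 5120)`, are discarded. In practice the per-mapper candidate lists would come from whatever scan statistic and significance threshold the analysis uses.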
