4.7 Article

How to optimize the precision of allele and haplotype frequency estimates using pooled-sequencing data

Journal

MOLECULAR ECOLOGY RESOURCES
Volume 18, Issue 2, Pages 194-203

Publisher

WILEY
DOI: 10.1111/1755-0998.12723

Keywords

allele frequency estimation; coverage depth; experimental evolution; fitness; haplotype frequency estimation; population genomics

Funding

  1. Agence Nationale de la Recherche [ANR SEAD - ANR-13-ADAP-0011]
  2. Marie Sklodowska-Curie/AgreenSkills Program [FP7-267196]


Sequencing pools of individuals rather than individuals separately reduces the cost of estimating allele frequencies at many loci in many populations. Theoretical and empirical studies show that sequencing pools comprising a limited number of individuals (typically fewer than 50) provides reliable allele frequency estimates, provided that the DNA pooling and DNA sequencing steps are carefully controlled. Unequal contributions of different individuals to the DNA pool, as well as the mean and variance of sequencing depth, can affect the standard error of allele frequency estimates. To our knowledge, no study has investigated the effects of these two factors separately, so there is currently no method to estimate a priori the relative importance of unequal individual DNA contributions independently of sequencing depth. We develop a new analytical model for allele frequency estimation that explicitly distinguishes these two effects. Our model shows that the DNA pooling variance in a pooled sequencing experiment depends solely on two factors: the number of individuals within the pool and the coefficient of variation of individual DNA contributions to the pool. We present a new method to estimate this coefficient of variation experimentally when planning a pooled sequencing design in which samples are pooled either before or after DNA extraction. Using this analytical and experimental framework, we provide guidelines to optimize the design of pooled sequencing experiments. Finally, we sequence replicated pools of inbred lines of the plant Medicago truncatula and show that the predictions from our model generally hold true when estimating the frequency of known multilocus haplotypes using pooled sequencing.
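
The separation between pooling variance and sequencing variance described in the abstract can be illustrated with a small simulation. This is a minimal sketch and not the authors' analytical model: it assumes fully inbred (haploid-like) individuals carrying the allele or not, gamma-distributed DNA contributions with mean 1 and a chosen coefficient of variation, and binomial read sampling at a fixed depth. All function and parameter names are hypothetical.

```python
import random
import statistics


def pool_frequency(genotypes, cv, rng):
    """Allele frequency in the DNA pool under unequal individual contributions.

    Assumption: contributions follow a gamma distribution with mean 1 and
    coefficient of variation `cv`; cv == 0 means perfectly equal contributions.
    """
    if cv == 0:
        weights = [1.0] * len(genotypes)
    else:
        shape = 1.0 / cv ** 2  # gamma(shape, scale) with shape * scale = 1
        weights = [rng.gammavariate(shape, cv ** 2) for _ in genotypes]
    return sum(w * g for w, g in zip(weights, genotypes)) / sum(weights)


def sequenced_estimate(pool_freq, depth, rng):
    """Frequency estimate from `depth` reads drawn binomially from the pool."""
    hits = sum(1 for _ in range(depth) if rng.random() < pool_freq)
    return hits / depth


def simulate_variances(n_individuals, true_freq, cv, depth, n_reps=2000, seed=1):
    """Return (variance of the pool frequency, variance of the read-based estimate)."""
    rng = random.Random(seed)
    n_alt = round(n_individuals * true_freq)
    # Assumption: each inbred individual carries the allele (1) or not (0).
    genotypes = [1] * n_alt + [0] * (n_individuals - n_alt)
    pooled, estimates = [], []
    for _ in range(n_reps):
        freq = pool_frequency(genotypes, cv, rng)
        pooled.append(freq)
        estimates.append(sequenced_estimate(freq, depth, rng))
    return statistics.pvariance(pooled), statistics.pvariance(estimates)


if __name__ == "__main__":
    for n in (10, 50):
        for cv in (0.0, 0.5, 1.0):
            v_pool, v_total = simulate_variances(n, 0.5, cv, depth=100)
            print(f"n={n:3d} cv={cv:.1f} pooling var={v_pool:.5f} total var={v_total:.5f}")
```

In this sketch the first returned variance reflects only the pooling step, so it changes with the number of individuals and the coefficient of variation but not with sequencing depth, while the second also includes read sampling noise.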
