4.8 Article

Maximum Likelihood Estimation of Frequencies of Known Haplotypes from Pooled Sequence Data

Journal

MOLECULAR BIOLOGY AND EVOLUTION
Volume 30, Issue 5, Pages 1145-1158

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/molbev/mst016

Keywords

maximum likelihood; EM algorithm; haplotype frequency estimation; pooled sequence data; metagenomics

Funding

  1. National Institutes of Health [T32 HG002536, R01 HG007089, R01 GM053275, R01 GM098614]
  2. National Science Foundation [EF-0928690]
  3. NATIONAL HUMAN GENOME RESEARCH INSTITUTE [R01HG007089, T32HG002536] Funding Source: NIH RePORTER
  4. NATIONAL INSTITUTE OF GENERAL MEDICAL SCIENCES [R01GM098614, R01GM053275] Funding Source: NIH RePORTER
  5. Emerging Frontiers [0928690] Funding Source: National Science Foundation

DNA samples are often pooled, either by experimental design or because the sample itself is a mixture. For example, when population allele frequencies are of primary interest, individual samples may be pooled together to lower the cost of sequencing. Alternatively, the sample itself may be a mixture of multiple species or strains (e.g., bacterial species comprising a microbiome or pathogen strains in a blood sample). We present an expectation-maximization algorithm for estimating haplotype frequencies in a pooled sample directly from mapped sequence reads, in the case where the possible haplotypes are known. This method is relevant to the analysis of pooled sequencing data from selection experiments, as well as the calculation of proportions of different species within a metagenomics sample. Our method outperforms existing methods based on single-site allele frequencies, as well as simple approaches using sequence read data. We have implemented the method in a freely available open-source software tool.
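The general idea of estimating mixture proportions over known haplotypes with expectation-maximization can be sketched as follows. This is a minimal illustration of a generic mixture-proportion EM, not the authors' implementation: the function name `estimate_frequencies`, the input layout (one row per read, one column per haplotype, each entry an assumed probability of the read given that haplotype), and the toy numbers are all hypothetical.

```python
def estimate_frequencies(read_likelihoods, max_iter=500, tol=1e-12):
    """Generic EM for mixture proportions over known haplotypes.

    read_likelihoods[r][h] is an assumed P(read r | haplotype h);
    how these values are computed from mapped reads is outside this sketch.
    """
    n_haps = len(read_likelihoods[0])
    freqs = [1.0 / n_haps] * n_haps  # start from uniform frequencies

    for _ in range(max_iter):
        # E-step: split each read across haplotypes in proportion to
        # (current frequency) x (likelihood of the read under that haplotype).
        expected = [0.0] * n_haps
        for row in read_likelihoods:
            weighted = [f * p for f, p in zip(freqs, row)]
            total = sum(weighted)
            if total > 0.0:  # skip reads incompatible with every haplotype
                for h in range(n_haps):
                    expected[h] += weighted[h] / total

        informative = sum(expected)
        if informative == 0.0:
            raise ValueError("no read is compatible with any haplotype")

        # M-step: new frequencies are the normalized expected read counts.
        updated = [e / informative for e in expected]

        change = max(abs(u - f) for u, f in zip(updated, freqs))
        freqs = updated
        if change < tol:
            break

    return freqs


# Toy data: 3 reads match only haplotype 0, 1 read matches only haplotype 1,
# and 4 reads are equally compatible with both (uninformative on their own).
toy_likelihoods = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] + [[1.0, 1.0]] * 4
freqs = estimate_frequencies(toy_likelihoods)
print(freqs)
```

In this toy case the ambiguous reads carry no information by themselves, so the estimate is driven by the three-to-one split among the unambiguous reads; the EM iterations redistribute the ambiguous reads until the frequencies stop changing.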
