Article

A dictionary model for haplotyping, genotype calling, and association testing

Journal

GENETIC EPIDEMIOLOGY
Volume 31, Issue 7, Pages 672-683

Publisher

WILEY
DOI: 10.1002/gepi.20232

Keywords

linkage disequilibrium; forward algorithm; backward algorithm; Gibbs sampling

Funding

  1. NHGRI NIH HHS [HG02536] Funding Source: Medline
  2. NIGMS NIH HHS [GM53275] Funding Source: Medline
  3. NIMH NIH HHS [MH59490] Funding Source: Medline

We propose a new method for haplotyping, genotype calling, and association testing based on a dictionary model for haplotypes. In this framework, a haplotype arises as a concatenation of conserved haplotype segments, drawn from a predefined dictionary according to segment-specific probabilities. The observed data consist of unphased multimarker genotypes gathered on a random sample of unrelated individuals. These genotypes are subject to mutation, genotyping errors, and missing data. The true pair of haplotypes corresponding to a person's multimarker genotype is reconstructed using a Markov chain that visits haplotype pairs according to their posterior probabilities. Our implementation of the chain alternates Gibbs steps, which rearrange the phase of a single marker, and Metropolis steps, which swap maternal and paternal haplotypes from a given marker onward. Outputs of the chain include the most likely haplotype pairs, the most likely genotypes at each marker, and the expected number of occurrences of each haplotype segment. Reconstruction accuracy is comparable to that achieved by the best existing algorithms. More importantly, the dictionary model yields expected counts of conserved haplotype segments. These imputed counts can serve as genetic predictors in association studies, as we illustrate with examples on cystic fibrosis, Friedreich's ataxia, and angiotensin-I converting enzyme levels.
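
The abstract does not give the model's equations, so the following is only a minimal sketch of how a forward recursion might score one haplotype under a dictionary of segments. It assumes haplotypes are strings of 0/1 alleles, ignores mutation, genotyping error, missing data, and any normalizing constant, and uses a made-up toy dictionary. The function name `haplotype_likelihood` and all probabilities are hypothetical and are not taken from the paper.

```python
def haplotype_likelihood(haplotype, dictionary):
    """Sum, over every way of splitting `haplotype` into dictionary
    segments, the product of the segment probabilities.

    forward[j] holds the summed weight of all segmentations of the
    first j markers. This is an illustrative simplification, not the
    authors' implementation.
    """
    n = len(haplotype)
    forward = [0.0] * (n + 1)
    forward[0] = 1.0  # the empty prefix has exactly one (empty) segmentation
    for end in range(1, n + 1):
        for segment, prob in dictionary.items():
            start = end - len(segment)
            # A segment contributes only if it fits and matches exactly.
            if start >= 0 and haplotype[start:end] == segment:
                forward[end] += forward[start] * prob
    return forward[n]


# Hypothetical toy dictionary: segment -> segment-specific probability.
toy_dictionary = {"0": 0.2, "1": 0.2, "01": 0.3, "110": 0.3}

if __name__ == "__main__":
    print(haplotype_likelihood("0110", toy_dictionary))
```

For the haplotype `0110`, three segmentations contribute under this toy dictionary: `0|1|1|0`, `0|110`, and `01|1|0`. A matching backward pass would, under the same assumptions, allow expected segment counts to be accumulated, which is the quantity the abstract proposes as a predictor in association tests.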
