Article

Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel

Journal

NATURE COMMUNICATIONS
Volume 5, Issue -, Pages -

Publisher

NATURE PORTFOLIO
DOI: 10.1038/ncomms4934

Funding

  1. Medical Research Council [G0801823]
  2. BBSRC [BB/I02593X/1] Funding Source: UKRI
  3. MRC [G0801823] Funding Source: UKRI
  4. Biotechnology and Biological Sciences Research Council [BB/I02593X/1] Funding Source: researchfish
  5. Medical Research Council [G0801823] Funding Source: researchfish

A major use of the 1000 Genomes Project (1000GP) data is genotype imputation in genome-wide association studies (GWAS). Here we develop a method to estimate haplotypes from low-coverage sequencing data that can take advantage of single-nucleotide polymorphism (SNP) microarray genotypes on the same samples. First, the SNP array data are phased to build a backbone (or 'scaffold') of haplotypes across each chromosome. We then phase the sequence data 'onto' this haplotype scaffold. This approach can take advantage of relatedness between sequenced and non-sequenced samples to improve accuracy. We use this method to create a new 1000GP haplotype reference set for use by the human genetics community. Using a set of validation genotypes at SNPs and bi-allelic indels, we show that these haplotypes have lower genotype discordance and improved imputation performance into downstream GWAS samples, especially at low-frequency variants.
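
The abstract reports lower genotype discordance against validation genotypes but does not define the metric here. As a rough illustration only, the sketch below assumes one common definition: the fraction of sites, among those genotyped in both sets, where the called genotype differs from the validation genotype. The function name `genotype_discordance`, the 0/1/2 alternate-allele-count coding, and the use of `None` for missing genotypes are all assumptions made for this example, not details taken from the paper.

```python
from typing import Optional, Sequence


def genotype_discordance(
    called: Sequence[Optional[int]],
    truth: Sequence[Optional[int]],
) -> float:
    """Fraction of comparable sites where the called genotype differs.

    Genotypes are assumed to be coded as alternate-allele counts
    (0, 1 or 2); None marks a missing genotype and is skipped.
    """
    if len(called) != len(truth):
        raise ValueError("genotype vectors must have the same length")

    compared = 0
    mismatched = 0
    for c, t in zip(called, truth):
        if c is None or t is None:
            continue  # site not genotyped in one of the two sets
        compared += 1
        if c != t:
            mismatched += 1

    if compared == 0:
        raise ValueError("no sites with genotypes in both sets")
    return mismatched / compared


# Toy data (made up for illustration): one mismatch among five comparable sites.
called = [0, 1, 2, 1, None, 0]
truth = [0, 1, 1, 1, 2, 0]
print(genotype_discordance(called, truth))  # 0.2
```

Published evaluations often restrict the comparison to non-reference genotypes or stratify it by allele frequency; this sketch does neither.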
