Article

Adjustment for Population Stratification via Principal Components in Association Analysis of Rare Variants

Journal

GENETIC EPIDEMIOLOGY
Volume 37, Issue 1, Pages 99-109

Publisher

WILEY
DOI: 10.1002/gepi.21691

Keywords

1000 Genomes Project; association tests; logistic regression; next-generation sequencing; SNP; SSU test

Funding

  1. NIH [R21DK089351, R01HL65462, R01HL105397, R01GM081535]
  2. NATIONAL HEART, LUNG, AND BLOOD INSTITUTE [R01HL065462, R01HL105397] Funding Source: NIH RePORTER
  3. NATIONAL INSTITUTE OF DIABETES AND DIGESTIVE AND KIDNEY DISEASES [R21DK089351] Funding Source: NIH RePORTER
  4. NATIONAL INSTITUTE OF GENERAL MEDICAL SCIENCES [R01GM081535] Funding Source: NIH RePORTER

Abstract

For unrelated samples, principal component (PC) analysis is established as a simple and effective way to adjust for population stratification in association analysis of common variants (CVs, minor allele frequency [MAF] > 5%). It is less clear how well it performs for low-frequency variants (LFVs, MAF between 1% and 5%) or rare variants (RVs, MAF < 1%). Furthermore, with next-generation sequencing data, it is unknown whether PCs should be constructed from CVs, LFVs, or RVs. In this study, we used the 1000 Genomes Project sequence data to explore the construction of PCs and their use in association analysis of LFVs or RVs for unrelated samples. A few top PCs based on either CVs or LFVs could separate two continental groups, European and African samples, whereas PCs based on RVs alone performed less well. When applied to several association tests in simulated data with population stratification, adjusting for PCs based on either CVs or LFVs was effective in controlling Type I error rates, while nonadjustment led to inflated Type I error rates. Perhaps the most interesting observation is that, although the PCs based on LFVs separated the two continental groups better than those based on CVs, using the LFV-based PCs could lead to overadjustment, in the sense of substantial power loss in the absence of population stratification; in contrast, we saw no problem with the CV-based PCs in any of our examples.
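The following is a minimal sketch of the PC-construction step the abstract describes, not the authors' implementation. It bins variants by MAF using the thresholds stated above, then computes the leading PC of standardized genotypes by power iteration. The function names, the toy two-group simulation, the handling of variants exactly at the 1% and 5% boundaries, and the use of power iteration in place of a full eigendecomposition are all assumptions made for illustration. The resulting PC scores would then be entered as covariates in the association model (for example, logistic regression), which is not shown here.

```python
import random


def minor_allele_frequency(genotypes):
    """MAF of one variant from per-sample allele counts (0, 1, or 2)."""
    freq = sum(genotypes) / (2 * len(genotypes))
    return min(freq, 1.0 - freq)


def maf_class(maf):
    """Bin a variant by MAF; boundary handling at 1% and 5% is an assumption."""
    if maf > 0.05:
        return "CV"
    if maf >= 0.01:
        return "LFV"
    return "RV"


def simulate_two_groups(n_per_group=40, n_variants=300, seed=1):
    """Hypothetical toy data: two groups whose allele frequencies differ."""
    rng = random.Random(seed)
    variants = []
    for _ in range(n_variants):
        p1 = rng.uniform(0.3, 0.7)
        p2 = min(0.95, max(0.05, p1 + rng.uniform(-0.4, 0.4)))
        freqs = [p1] * n_per_group + [p2] * n_per_group
        # Each genotype is the sum of two independent allele draws.
        variants.append([sum(rng.random() < p for _ in range(2)) for p in freqs])
    return variants


def top_pc(variants, n_iter=50, seed=0):
    """Leading PC scores (one per sample) via power iteration on X X^T."""
    n = len(variants[0])
    cols = []
    for g in variants:
        mean = sum(g) / n
        var = sum((x - mean) ** 2 for x in g) / n
        if var > 0:  # skip monomorphic variants, which carry no information
            sd = var ** 0.5
            cols.append([(x - mean) / sd for x in g])
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(n_iter):
        # w = X X^T v, accumulated one standardized variant at a time.
        w = [0.0] * n
        for c in cols:
            dot = sum(ci * vi for ci, vi in zip(c, v))
            for i in range(n):
                w[i] += c[i] * dot
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


variants = simulate_two_groups()
# Build the PC from common variants only, the choice the abstract favors.
common = [g for g in variants if maf_class(minor_allele_frequency(g)) == "CV"]
pc1 = top_pc(common)
```

In this toy setting the first 40 entries of `pc1` belong to one simulated group and the last 40 to the other, so comparing the two halves gives a quick check of how well the PC separates them. For real sequencing data, a dedicated tool or a numerical library would be the more practical choice than this pure-Python loop.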

