Article

Limitations of principal components in quantitative genetic association models for human studies

Journal

ELIFE
Volume 12, Issue -, Pages -

Publisher

eLIFE SCIENCES PUBL LTD
DOI: 10.7554/eLife.79238

Keywords

genetic association; statistical genetics; population structure; cryptic relatedness; complex quantitative traits; multiethnic human data and simulations

Principal Component Analysis (PCA) and the Linear Mixed-effects Model (LMM) are common genetic association models, but PCA performs poorly at modeling complex relatedness structures, whereas LMM usually performs better. Family relatedness and environment effects both strongly affect association studies in human datasets, and environment effects driven by geography and ethnicity are better modeled by an LMM that includes those labels rather than PCs.
Principal Component Analysis (PCA) and the Linear Mixed-effects Model (LMM), sometimes in combination, are the most common genetic association models. Previous PCA-LMM comparisons give mixed results and unclear guidance, and have several limitations, including not varying the number of principal components (PCs), simulating only simple population structures, and using real data and power evaluations inconsistently. We evaluate PCA and LMM, both with varying numbers of PCs, in realistic genotype and complex trait simulations including admixed families, subpopulation trees, and real multiethnic human datasets with simulated traits. We find that LMM without PCs usually performs best, with the largest effects in family simulations and in real human datasets and traits without environment effects. Poor PCA performance on human datasets is driven more by large numbers of distant relatives than by the smaller number of closer relatives. While PCA was known to fail on family data, we report strong effects of family relatedness in genetically diverse human datasets, which are not avoided by pruning close relatives. Environment effects driven by geography and ethnicity are better modeled with LMM including those labels instead of PCs. This work better characterizes the severe limitations of PCA compared to LMM in modeling the complex relatedness structures of multiethnic human data for association studies.
