4.6 Article

A geometric relationship of F2, F3 and F4-statistics with principal component analysis

Publisher

ROYAL SOC
DOI: 10.1098/rstb.2020.0413

Keywords

population structure; principal component analysis; F-statistics


This article explores the relationship between principal component analysis (PCA) and F-statistics in studying human genetic variation. The author derives explicit connections between the two approaches and demonstrates that F-statistics can be interpreted geometrically in the context of PCA. The results show that PCA plots are effective at predicting F-statistics, providing a new perspective for understanding human genetic variation.
Principal component analysis (PCA) and F-statistics sensu Patterson are two of the most widely used population genetic tools to study human genetic variation. Here, I derive explicit connections between the two approaches and show that these two methods are closely related. F-statistics have a simple geometrical interpretation in the context of PCA, and orthogonal projections are a key concept to establish this link. I show that for any pair of populations, any population that is admixed as determined by an F3-statistic will lie inside a circle on a PCA plot. Furthermore, the F4-statistic is closely related to an angle measurement, and will be zero if the differences between pairs of populations intersect at a right angle in PCA space. I illustrate my results on two examples, one of Western Eurasian, and one of global human diversity. In both examples, I find that the first few PCs are sufficient to approximate most F-statistics, and that PCA plots are effective at predicting F-statistics. Thus, while F-statistics are commonly understood in terms of discrete populations, the geometric perspective illustrates that they can be viewed in a framework of populations that vary in a more continuous manner. This article is part of the theme issue 'Celebrating 50 years since Lewontin's apportionment of human diversity'.
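
The geometric statements in the abstract can be illustrated with a small sketch. It is not the author's code. It assumes each population is a point in Euclidean space (for example allele frequencies or PC coordinates) and uses the standard unnormalised forms F2(A, B) = |A - B|^2, F3(X; A, B) = <X - A, X - B> and F4(A, B; C, D) = <A - B, C - D>. The function names and the toy coordinates are hypothetical, and per-SNP averaging and sampling-bias corrections are omitted.

```python
# Toy illustration of the F-statistic geometry; all coordinates are made up.

def sub(u, v):
    """Component-wise difference u - v."""
    return [a - b for a, b in zip(u, v)]

def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))

def f2(a, b):
    """F2 as squared Euclidean distance between two populations."""
    d = sub(a, b)
    return dot(d, d)

def f3(x, a, b):
    """F3(X; A, B) as the inner product <X - A, X - B>."""
    return dot(sub(x, a), sub(x, b))

def f4(a, b, c, d):
    """F4(A, B; C, D) as the inner product <A - B, C - D>."""
    return dot(sub(a, b), sub(c, d))

def inside_circle(x, a, b):
    """True if X lies strictly inside the circle whose diameter is the segment AB."""
    center = [(p + q) / 2 for p, q in zip(a, b)]
    return f2(x, center) < f2(a, b) / 4

# Hypothetical populations placed on a 2D "PCA plot".
A, B = [0.0, 0.0], [4.0, 0.0]
X = [2.0, 1.0]   # lies inside the circle on AB
Y = [2.0, 3.0]   # lies outside the circle on AB
C, D = [1.0, 5.0], [1.0, 1.0]   # C - D is perpendicular to A - B

print(f3(X, A, B), inside_circle(X, A, B))   # negative F3, inside the circle
print(f3(Y, A, B), inside_circle(Y, A, B))   # positive F3, outside the circle
print(f4(A, B, C, D))                        # zero: the two differences are orthogonal
```

In this toy setting, the sign of F3 matches the circle test exactly because <X - A, X - B> = |X - M|^2 - |A - B|^2 / 4, where M is the midpoint of A and B. With real data, the agreement on a two-dimensional PCA plot is only approximate, since higher PCs also contribute to the statistics.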
