Article

A rarefaction approach for measuring population differences in rare and common variation

Journal

GENETICS
Volume 224, Issue 2

Publisher

Genetics Society of America
DOI: 10.1093/genetics/iyad070

Keywords

common variants; rare variants; rarefaction; sample-size correction


When studying allele-frequency variation across populations, it is important to account for discreteness effects due to differences in sample sizes. This study introduces a rarefaction-based sample-size correction that compares rare and common variation across multiple populations with potentially different sample sizes. The results highlight subtle differences in allele-frequency patterns across populations and provide insights into allele classifications and their dependencies on subsample sizes.
In studying allele-frequency variation across populations, it is often convenient to classify an allelic type as rare, with nonzero frequency less than or equal to a specified threshold, common, with a frequency above the threshold, or entirely unobserved in a population. When sample sizes differ across populations, however, especially if the threshold separating rare and common corresponds to a small number of observed copies of an allelic type, discreteness effects can lead a sample from one population to possess substantially more rare allelic types than a sample from another population, even if the two populations have extremely similar underlying allele-frequency distributions across loci. We introduce a rarefaction-based sample-size correction for use in comparing rare and common variation across multiple populations whose sample sizes potentially differ. We use our approach to examine rare and common variation in worldwide human populations, finding that the sample-size correction introduces subtle differences relative to analyses that use the full available sample sizes. We introduce several ways in which the rarefaction approach can be applied: we explore the dependence of allele classifications on subsample sizes, we permit more than two classes of allelic types of nonzero frequency, and we analyze rare and common variation in sliding windows along the genome. The results can assist in clarifying similarities and differences in allele-frequency patterns across populations.
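The sample-size correction described above can be illustrated with a minimal sketch. It assumes the standard hypergeometric form of rarefaction: if an allelic type has a given number of copies in a full sample, the number of copies in a random subsample of size g drawn without replacement is hypergeometric. This is not necessarily the authors' exact formulation. The function names, the count-based threshold `max_rare_copies`, and the example numbers are hypothetical and chosen only for illustration.

```python
from math import comb


def subsample_class_probs(n_total, n_copies, g, max_rare_copies):
    """Probabilities that an allelic type is unobserved, rare, or common
    in a random subsample of size g drawn without replacement.

    n_total         -- size of the full sample (e.g. number of chromosomes)
    n_copies        -- copies of the allelic type in the full sample
    g               -- common subsample size used for every population
    max_rare_copies -- assumed threshold: 1..max_rare_copies copies counts as rare
    """
    if not (0 <= n_copies <= n_total and 1 <= g <= n_total):
        raise ValueError("need 0 <= n_copies <= n_total and 1 <= g <= n_total")

    denom = comb(n_total, g)

    def p_copies(j):
        # Hypergeometric probability of exactly j copies in the subsample.
        return comb(n_copies, j) * comb(n_total - n_copies, g - j) / denom

    p_unobserved = p_copies(0)
    top = min(max_rare_copies, n_copies, g)
    p_rare = sum(p_copies(j) for j in range(1, top + 1))
    # Clamp at zero to absorb floating-point rounding.
    p_common = max(0.0, 1.0 - p_unobserved - p_rare)
    return p_unobserved, p_rare, p_common


def expected_class_counts(n_total, copies_per_allele, g, max_rare_copies):
    """Expected numbers of unobserved, rare, and common allelic types in a
    subsample of size g, summed over the allelic types of one population."""
    totals = [0.0, 0.0, 0.0]
    for n_copies in copies_per_allele:
        probs = subsample_class_probs(n_total, n_copies, g, max_rare_copies)
        for i, p in enumerate(probs):
            totals[i] += p
    return tuple(totals)


if __name__ == "__main__":
    # Hypothetical data: two populations with very different sample sizes,
    # both rarefied to the same subsample size g = 40.
    pop_a = expected_class_counts(200, [1, 2, 5, 50, 120], g=40, max_rare_copies=2)
    pop_b = expected_class_counts(40, [1, 1, 3, 10, 24], g=40, max_rare_copies=2)
    print("population A (unobserved, rare, common):", pop_a)
    print("population B (unobserved, rare, common):", pop_b)
```

Evaluating every population at the same g puts the rare and common counts on a common footing, so a population with a larger sample is not credited with extra rare allelic types simply because more copies were examined. Allowing more than two nonzero-frequency classes, as the abstract mentions, would amount to splitting the sum over j at additional thresholds.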

