Article

f-Statistics estimation and admixture graph construction with Pool-Seq or allele count data using the R package poolfstat

Journal

MOLECULAR ECOLOGY RESOURCES
Volume 22, Issue 4, Pages 1394-1416

Publisher

WILEY
DOI: 10.1111/1755-0998.13557

Keywords

admixture graph; demographic inference; Drosophila suzukii; f-statistics; Pool-Seq

Funding

  1. Agence Nationale de la Recherche [ANR-16-CE02-0015-01, ANR-20-CE02-0018]

By capturing the structuring of genetic variation across populations, F-statistics have proven effective in inferring demographic history. A reinterpretation of F (and D) parameters has led to unbiased estimators for Pool-Seq data and standard allele count data. The new package poolfstat provides a user-friendly and efficient tool for unraveling complex population genetic histories.
By capturing various patterns of the structuring of genetic variation across populations, f-statistics have proved highly effective for the inference of demographic history. Such statistics are defined as covariances of SNP allele frequency differences among sets of populations without requiring haplotype information and are hence particularly relevant for the analysis of pooled sequencing (Pool-Seq) data. Here we propose a reinterpretation of the F (and D) parameters in terms of probability of gene identity and derive from this unified definition unbiased estimators for both Pool-Seq data and standard allele count data obtained from individual genotypes. We implemented these estimators in a new version of the R package poolfstat, which now includes a wide range of inference methods: (i) three-population test of admixture; (ii) four-population test of treeness; (iii) F4-ratio estimation of admixture rates; and (iv) fitting, visualization and (semi-automatic) construction of admixture graphs. A comprehensive evaluation of the methods implemented in poolfstat on both simulated Pool-Seq data (with various sequencing coverages and error rates) and allele count data confirmed the accuracy of these approaches, even for the most cost-effective Pool-Seq design involving relatively low sequencing coverages. We further analysed a real Pool-Seq data set of 14 populations of the invasive species Drosophila suzukii, which allowed us to refine both the demographic history of native populations and the invasion routes followed by this emblematic pest. Our new package poolfstat provides the community with a user-friendly and efficient all-in-one tool to unravel complex population genetic histories from large Pool-Seq or allele count SNP data sets.
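
To make the "covariances of SNP allele frequency differences" wording concrete, the following is a minimal sketch of the textbook definitions of F2, F3 and F4, computed as averages over SNPs from allele frequencies that are assumed to be known exactly. It is not the poolfstat implementation and does not use its API: the function names (`f2`, `f3`, `f4`) and the toy frequencies are hypothetical, and the naive plug-in averages shown here would generally be biased for F2 and F3 if applied to sample or read-count frequencies, which is the problem the unbiased estimators described in the abstract address.

```python
from statistics import fmean

# Hypothetical helpers: each argument is a list of allele frequencies,
# one value per SNP, assumed here to be known without sampling error.

def f2(a, b):
    """Average squared allele frequency difference between A and B."""
    return fmean((x - y) ** 2 for x, y in zip(a, b))

def f3(c, a, b):
    """F3(C; A, B): average of (c - a) * (c - b) over SNPs."""
    return fmean((z - x) * (z - y) for z, x, y in zip(c, a, b))

def f4(a, b, c, d):
    """F4(A, B; C, D): average of (a - b) * (c - d) over SNPs."""
    return fmean((w - x) * (y - z) for w, x, y, z in zip(a, b, c, d))

# Toy data (made up for illustration): C is an exact 50/50 mixture of A and B.
A = [0.10, 0.50, 0.90, 0.30]
B = [0.30, 0.40, 0.70, 0.60]
C = [(x + y) / 2 for x, y in zip(A, B)]
D = [0.25, 0.55, 0.60, 0.20]

# Three-population test logic: a negative F3(C; A, B) is a signal that
# C is admixed between sources related to A and B.
f3_cab = f3(C, A, B)

# F3 and F4 can both be rewritten as linear combinations of F2 values.
f3_from_f2 = (f2(C, A) + f2(C, B) - f2(A, B)) / 2
f4_abcd = f4(A, B, C, D)
f4_from_f2 = (f2(A, D) + f2(B, C) - f2(A, C) - f2(B, D)) / 2

print(f3_cab < 0)
print(abs(f3_cab - f3_from_f2) < 1e-12, abs(f4_abcd - f4_from_f2) < 1e-12)
```

In this toy case F3(C; A, B) equals minus one quarter of F2(A, B), so it is negative by construction. With real data, the sign of F3 (three-population test) and the closeness of F4 to zero (four-population test) would be judged against standard errors, which this sketch does not attempt to compute.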
