Journal
MOLECULAR ECOLOGY
Volume 22, Issue 11, Pages 3165-3178
Publisher
WILEY
DOI: 10.1111/mec.12089
Keywords
allele dropout; allele frequency; FST; heterozygosity; next-generation sequencing; RAD markers; single-nucleotide polymorphisms
Funding
- Institut National de la Recherche Agronomique
- French 'Agence Nationale de la Recherche' through the ANR project GenoPheno [2010-JCJC-1705-01]
- UK Natural Environment Research Council [R8/H10/56]
- Medical Research Council [G0900740]
- BBSRC [BB/H023844/1] Funding Source: UKRI
- MRC [G0900740] Funding Source: UKRI
- NERC [NE/H019804/1, NBAF010003] Funding Source: UKRI
- Biotechnology and Biological Sciences Research Council [BB/H023844/1] Funding Source: researchfish
- Medical Research Council [G0900740] Funding Source: researchfish
- Natural Environment Research Council [NBAF010003, NE/H019804/1] Funding Source: researchfish
Inexpensive short-read sequencing technologies applied to reduced representation genomes are revolutionizing genetic research, especially population genetics analysis, by allowing the genotyping of massive numbers of single-nucleotide polymorphisms (SNPs) for large numbers of individuals and populations. Restriction site-associated DNA (RAD) sequencing is a recent technique based on the characterization of genomic regions flanking restriction sites. One of its potential drawbacks is the presence of polymorphism within the restriction site, which makes it impossible to observe the associated SNP allele (i.e. allele dropout, ADO). To investigate the effect of ADO on genetic variation estimated from RAD markers, we first mathematically derived measures of the effect of ADO on allele frequencies as a function of different parameters within a single population. We then used RAD data sets simulated using a coalescence model to investigate the magnitude of biases induced by ADO on the estimation of expected heterozygosity and FST under a simple demographic model of divergence between two populations. We found that ADO tends to cause overestimation of genetic variation both within and between populations. Assuming a mutation rate per nucleotide between 10^-9 and 10^-8, this bias remained low for most studied combinations of divergence time and effective population size, except for large effective population sizes. Averaging FST values over multiple SNPs, for example, by sliding window analysis, did not correct ADO biases. We briefly discuss possible solutions to filter the most problematic cases of ADO using read coverage to detect markers with a large excess of null alleles.
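As a rough illustration of how allele dropout can distort estimates, the following Python sketch compares the allele frequency and expected heterozygosity computed from all haplotypes with those computed only from haplotypes whose restriction site is intact. It is a toy haplotype-level model, not the derivation or the coalescent simulation used in the paper: the function names, the `(snp_allele, site_intact)` representation, and the ten hand-picked haplotypes are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical representation: (snp_allele, site_intact)
# snp_allele is 0 (ancestral) or 1 (derived); site_intact is False when a
# mutation in the restriction site prevents the haplotype from being sequenced.
Haplotype = Tuple[int, bool]


def allele_freq(haps: List[Haplotype], observed_only: bool) -> float:
    """Frequency of allele 1, optionally restricted to observable haplotypes."""
    kept = [allele for allele, intact in haps if intact or not observed_only]
    if not kept:
        raise ValueError("no haplotypes left to estimate a frequency from")
    return sum(kept) / len(kept)


def expected_het(p: float) -> float:
    """Expected heterozygosity for a biallelic SNP, 2p(1 - p)."""
    return 2.0 * p * (1.0 - p)


# Hand-picked toy sample (an assumption, not data from the paper):
# 2 derived haplotypes with an intact site,
# 4 ancestral haplotypes with an intact site,
# 4 ancestral haplotypes carrying a null (mutated) restriction site.
sample: List[Haplotype] = [(1, True)] * 2 + [(0, True)] * 4 + [(0, False)] * 4

true_p = allele_freq(sample, observed_only=False)
obs_p = allele_freq(sample, observed_only=True)

print("true frequency:", true_p, "true He:", expected_het(true_p))
print("observed frequency:", obs_p, "observed He:", expected_het(obs_p))
```

In this particular sample the null alleles all sit on ancestral haplotypes, so dropping them inflates both the derived-allele frequency and the expected heterozygosity. Other arrangements of null alleles would give different biases, so the sketch only shows the mechanism, not the general direction or magnitude reported in the abstract.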