Journal
BMC BIOINFORMATICS
Volume 21, Issue 1
Publisher
BMC
DOI: 10.1186/s12859-020-03701-4
Keywords
RADseq; SNP analysis; Population genomics; Population structure; Admixture analysis
Funding
- University of Arkansas
Background: Research on the molecular ecology of non-model organisms, while previously constrained, has now been greatly facilitated by the advent of reduced-representation sequencing protocols. However, tools that allow these large datasets to be efficiently parsed are often lacking, or if indeed available, then limited by the necessity of a comparable reference genome as an adjunct. This, of course, can be difficult when working with non-model organisms. Fortunately, pipelines are currently available that avoid this prerequisite, thus allowing data to be a priori parsed. An oft-used molecular ecology program (i.e., Structure), for example, is facilitated by such pipelines, yet they are surprisingly absent for a second program that is similarly popular and computationally more efficient (i.e., Admixture). The two programs differ in that Admixture employs a maximum-likelihood framework whereas Structure uses a Bayesian approach, yet both produce similar results. Given these issues, there is an overriding (and recognized) need among researchers in molecular ecology for bioinformatic software that will not only condense output from replicated Admixture runs, but also infer from these data the optimal number of population clusters (K).
Results: Here we provide such a program (i.e., AdmixPipe) that (a) filters SNPs to allow the delineation of population structure in Admixture, then (b) parses the output for summarization and graphical representation via Clumpak. Our benchmarks effectively demonstrate how efficient the pipeline is for processing large, non-model datasets generated via double digest restriction-site associated DNA sequencing (ddRAD). Outputs not only parallel those from Structure, but also visualize the variation among individual Admixture runs, so as to facilitate selection of the most appropriate K-value.
Conclusions: AdmixPipe successfully integrates Admixture analysis with popular variant call format (VCF) filtering software to yield file types readily analyzed by Clumpak. Large population genomic datasets derived from non-model organisms are efficiently analyzed via the parallel-processing capabilities of Admixture. AdmixPipe is distributed under the GNU Public License and freely available for Mac OSX and Linux platforms at:.
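The abstract mentions condensing replicated Admixture runs and inferring the optimal K. As a rough illustration only, and not AdmixPipe's own code, the sketch below assumes each replicate was run with Admixture's cross-validation option and that its log text contains lines of the form "CV error (K=3): 0.52345". The function names (collect_cv_errors, summarize, best_k) and the sample log values are hypothetical, and picking the lowest mean cross-validation error is one common heuristic rather than the selection rule confirmed by the source.

```python
import re
from collections import defaultdict
from statistics import mean, stdev

# Assumed log format: Admixture run with cross-validation prints
# lines such as "CV error (K=3): 0.52345".
CV_LINE = re.compile(r"CV error \(K=(\d+)\):\s*([0-9.eE+-]+)")


def collect_cv_errors(log_texts):
    """Group cross-validation errors by K across replicate log texts."""
    errors = defaultdict(list)
    for text in log_texts:
        for k, err in CV_LINE.findall(text):
            errors[int(k)].append(float(err))
    return dict(errors)


def summarize(errors):
    """Return {K: (mean, sd)}; sd is 0.0 when a K has a single replicate."""
    return {
        k: (mean(v), stdev(v) if len(v) > 1 else 0.0)
        for k, v in sorted(errors.items())
    }


def best_k(errors):
    """Pick the K with the lowest mean CV error (a heuristic, see lead-in)."""
    summary = summarize(errors)
    return min(summary, key=lambda k: summary[k][0])


if __name__ == "__main__":
    # Hypothetical log snippets from two replicate runs.
    logs = [
        "CV error (K=1): 0.60\nCV error (K=2): 0.50\nCV error (K=3): 0.55\n",
        "CV error (K=1): 0.62\nCV error (K=2): 0.48\nCV error (K=3): 0.57\n",
    ]
    errs = collect_cv_errors(logs)
    for k, (m, sd) in summarize(errs).items():
        print(f"K={k}: mean={m:.4f} sd={sd:.4f}")
    print("lowest mean CV error at K =", best_k(errs))
```

The per-K standard deviation gives a simple numeric view of the variation among replicate runs that the abstract says the pipeline visualizes.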