4.6 Article

Anacapa Toolkit: An environmental DNA toolkit for processing multilocus metabarcode datasets

Journal

METHODS IN ECOLOGY AND EVOLUTION
Volume 10, Issue 9, Pages 1469-1475

Publisher

WILEY
DOI: 10.1111/2041-210X.13214

Keywords

Bayesian methods; biodiversity; community ecology; environmental DNA; metabarcoding; molecular methods; multilocus metabarcoding processing; sequence data

Funding

  1. OISE [1243541]
  2. NSF-DEB [1644641]
  3. University of California President's Research Catalyst Award [CA-16-376437]
  4. NSF-DGE [1650604]
  5. NSF-GRFP [2015204395]
  6. University of California
  7. Gordon and Betty Moore Foundation [6864]
  8. Direct For Biological Sciences [1644641] Funding Source: National Science Foundation
  9. Division Of Environmental Biology [1644641] Funding Source: National Science Foundation
  10. Division Of Graduate Education
  11. Direct For Education and Human Resources [1650604] Funding Source: National Science Foundation

Environmental DNA (eDNA) metabarcoding is a promising method to monitor species and community diversity that is rapid, affordable and non-invasive. The longstanding needs of the eDNA community are modular informatics tools, comprehensive and customizable reference databases, flexibility across high-throughput sequencing platforms, fast multilocus metabarcode processing and accurate taxonomic assignment. Improvements in bioinformatics tools make addressing each of these demands within a single toolkit a reality.
The new modular metabarcode sequence toolkit Anacapa addresses the above needs, allowing users to build comprehensive reference databases and assign taxonomy to raw multilocus metabarcode sequence data. A novel aspect of Anacapa is its database building module, Creating Reference libraries Using eXisting tools (CRUX), which generates comprehensive reference databases for specific user-defined metabarcoding loci. The Quality Control and ASV Parsing module sorts multiple metabarcoding loci and processes merged, unmerged and unpaired reads to maximize recovered diversity. DADA2 then detects amplicon sequence variants (ASVs), and the Anacapa Classifier module aligns these ASVs to CRUX-generated reference databases using Bowtie2. Lastly, taxonomy is assigned to ASVs with confidence scores using a Bayesian Lowest Common Ancestor (BLCA) method. The Anacapa Toolkit also includes an R package, ranacapa, for automated results exploration through standard biodiversity statistical analysis.
Benchmarking tests verify that the Anacapa Toolkit effectively and efficiently generates comprehensive reference databases that capture taxonomic diversity, and can assign taxonomy to both MiSeq and HiSeq-length sequence data. We demonstrate the value of the Anacapa Toolkit in assigning taxonomy to seawater eDNA samples collected in southern California.
The Anacapa Toolkit improves the functionality of eDNA and streamlines biodiversity assessment and management by generating metabarcode specific databases, processing multilocus data, retaining a larger proportion of sequencing reads and expanding non-traditional eDNA targets. All the components of the Anacapa Toolkit are open and available in a virtual container to ease installation.
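
The last step of the workflow summarized above, assigning taxonomy to an ASV with a confidence score at each rank, can be illustrated with a minimal sketch. This is not the BLCA algorithm and not code from the Anacapa Toolkit: it is a simplified weighted vote over reference hits, and the function name `assign_taxonomy`, the `min_support` threshold, the rank list and the placeholder taxon names are all assumptions made for illustration. It assumes every hit carries a full lineage with one name per rank.

```python
from collections import defaultdict

# Hypothetical rank list; a real pipeline may use different ranks.
RANKS = ["phylum", "class", "order", "family", "genus", "species"]


def assign_taxonomy(hits, min_support=0.8):
    """Assign taxonomy down to the deepest rank with enough support.

    hits: list of (lineage, weight) pairs, where lineage is a tuple with
    one name per entry in RANKS and weight is a non-negative alignment
    weight. Returns a list of (rank, name, confidence) tuples.
    """
    total = sum(weight for _, weight in hits)
    if total <= 0:
        return []

    assignment = []
    prefix = ()  # lineage chosen so far
    for depth, rank in enumerate(RANKS):
        support = defaultdict(float)
        for lineage, weight in hits:
            # Count only hits that agree with the lineage chosen so far.
            if lineage[:depth] == prefix:
                support[lineage[depth]] += weight
        if not support:
            break
        name, weight = max(support.items(), key=lambda item: item[1])
        confidence = weight / total
        if confidence < min_support:
            break  # stop at the last well-supported rank
        assignment.append((rank, name, round(confidence, 3)))
        prefix += (name,)
    return assignment


# Made-up hits for one ASV: two reference sequences from the same genus.
example_hits = [
    (("P", "C", "O", "F", "G", "G sp1"), 6.0),
    (("P", "C", "O", "F", "G", "G sp2"), 4.0),
]
example_result = assign_taxonomy(example_hits, min_support=0.8)
```

With these made-up hits the two references disagree only at species level, so the sketch reports an assignment down to genus and withholds a species call. That is the general behaviour a lowest-common-ancestor approach aims for, although BLCA itself derives its confidence scores differently.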
