Article

Do pseudogenes pose a problem for metabarcoding marine animal communities?

Journal

MOLECULAR ECOLOGY RESOURCES
Volume 22, Issue 8, Pages 2897-2914

Publisher

WILEY
DOI: 10.1111/1755-0998.13667

Keywords

COI; DNA barcoding; high-throughput sequencing; marine biodiversity; nuclear mitochondrial pseudogenes; NUMTs

Funding

  1. Natural Sciences and Engineering Research Council of Canada
  2. New Frontiers in Research Fund [NFRFT-2020-00073]
  3. Ontario Genomics
  4. Genome Canada
  5. Government of Canada

This study quantifies the occurrence and attributes of nuclear mitochondrial pseudogenes (NUMTs), which can inflate diversity estimates, in marine animal genomes. Short amplicons pose the greatest interpretational risk, but considering both amplicon length and position can minimize the effects of NUMTs on operational taxonomic unit (OTU) counts and barcode variation.
Because DNA metabarcoding typically employs sequence diversity among mitochondrial amplicons to estimate species composition, nuclear mitochondrial pseudogenes (NUMTs) can inflate diversity estimates. This study quantifies the incidence and attributes of NUMTs derived from the 658-bp barcode region of cytochrome c oxidase I (COI) in 156 marine animal genomes. NUMTs were examined to ascertain whether they could be recognized by their possession of indels or stop codons. In total, 309 NUMTs ≥150 bp were detected, with an average of 1.98 per species (range = 0-33) and a mean length of 391 ± 200 bp. Among this total, 75 (24.3%) lacked indels or stop codons. NUMTs appear to pose the greatest interpretational risk when short (<313 bp) amplicons are used, such as in environmental DNA studies, dietary analyses or processed fish identification. With the standard amplicon length (313 bp) for marine metabarcoding, NUMTs could potentially inflate the operational taxonomic unit (OTU) count by 21% above the true species count while also raising intraspecific variation at COI by 15%. However, when both amplicon length and position are considered, inflation in OTU counts and in barcode variation was just 9% and 10%, respectively, suggesting NUMTs will not seriously distort biodiversity assessments. There was a weak positive correlation between genome size and NUMT count but no variation among phyla or trophic groups. Until bioinformatic advances improve NUMT detection, the best defence involves targeting long amplicons and developing reference databases that include both mitochondrial sequences and their NUMT derivatives.
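The indel and stop-codon criteria mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `screen_amplicon`, its parameters, and the default stop-codon set are hypothetical choices made for illustration. It assumes the amplicon's reading frame is known, treats any length difference from the expected amplicon that is not a multiple of 3 as a frameshifting indel, and uses TAA and TAG as stop codons (vertebrate mitochondrial data would also need AGA and AGG). A realistic screen would align reads to a reference instead of relying on length alone.

```python
# Stop codons assumed for invertebrate mitochondrial COI (illustrative default;
# vertebrate mitochondrial data would also need AGA and AGG).
DEFAULT_STOPS = frozenset({"TAA", "TAG"})


def screen_amplicon(seq, expected_len=313, frame=0, stops=DEFAULT_STOPS):
    """Return the reasons a read looks like a NUMT (empty list if none).

    Hypothetical helper: flags frameshifting length differences and
    in-frame stop codons, the two signatures named in the abstract.
    """
    seq = seq.upper()
    flags = []

    # A length difference that is not a multiple of 3 implies a frameshift.
    # In-frame indels (multiples of 3) are NOT detected by this check.
    if (len(seq) - expected_len) % 3 != 0:
        flags.append("frameshift_indel")

    # Scan complete codons in the assumed reading frame for a stop codon.
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in stops:
            flags.append("stop_codon")
            break

    return flags


if __name__ == "__main__":
    # Toy 15-bp sequences stand in for real 313-bp amplicons.
    print(screen_amplicon("ATGATGATGATGATG", expected_len=15))  # no flags
    print(screen_amplicon("ATGTAAATGATGATG", expected_len=15))  # stop codon
    print(screen_amplicon("ATGATGATGATGAT", expected_len=15))   # frameshift
```

By construction, a screen of this kind cannot flag the 75 NUMTs (24.3%) reported to lack both indels and stop codons, which is consistent with the abstract's recommendation to rely on longer amplicons and on reference databases that include NUMT derivatives.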
