Article

SOAPBarcode: revealing arthropod biodiversity through assembly of Illumina shotgun sequences of PCR amplicons

Journal

METHODS IN ECOLOGY AND EVOLUTION
Volume 4, Issue 12, Pages 1142-1150

Publisher

WILEY
DOI: 10.1111/2041-210X.12120

Keywords

high-throughput sequencing; metabarcoding; next-generation-sequencing; operational taxonomic units; phylogenetic diversity; species richness; standard barcode

Funding

  1. National High-tech Research and Development Project (863) of China [2012AA021601]
  2. BGI
  3. Yunnan Province [20080A001]
  4. Chinese Academy of Sciences [0902281081, KSCX2-YW-Z-1027]
  5. National Natural Science Foundation of China [31170498]
  6. Ministry of Science and Technology of China [2012FY110800]
  7. University of East Anglia
  8. State Key Laboratory of Genetic Resources and Evolution at the Kunming Institute of Zoology

Metabarcoding of mixed arthropod samples for biodiversity assessment has mostly been carried out on the 454 GS FLX sequencer (Roche, Branford, Connecticut, USA), due to its ability to produce long reads (400 bp) that are believed to allow higher taxonomic resolution. The Illumina sequencing platforms, with their much higher throughput, could potentially reduce sequencing costs and improve sequence quality, but the associated shorter read length (typically <150 bp) has deterred their use in next-generation-sequencing (NGS)-based analyses of eukaryotic biodiversity, which often utilize standard barcode markers (e.g. COI, rbcL, matK, ITS) that are hundreds of nucleotides long.
We present a new Illumina-based pipeline to recover full-length COI barcodes from mixed arthropod samples. Our new assembly program, SOAPBarcode, a variant of the genome assembly program SOAPdenovo, uses paired-end reads of the standard COI barcode region as anchors to extract the correct pathways (sequences) out of otherwise chaotic de Bruijn graphs, which are caused by the presence of large numbers of COI homologs of high sequence similarity.
Two bulk insect samples of known species composition were analysed in a recently published 454 metabarcoding study (Yu et al. 2012) and are re-analysed here with our pipeline. Compared to the Roche 454 results (c. 400-bp reads), our pipeline recovered full-length COI barcodes (658 bp) and 17-31% more species-level operational taxonomic units (OTUs) from the bulk insect samples, with fewer untraceable (novel) OTUs. On the other hand, our PCR-based pipeline also revealed higher rates of contamination across samples, due to Illumina's increased sequencing depth. On balance, the assembled full-length barcodes and increased OTU recovery rates resulted in more resolved taxonomic assignments and more accurate beta diversity estimation.
The HiSeq 2000 and the SOAPBarcode pipeline together can achieve more accurate biodiversity assessment at a much reduced sequencing cost in metabarcoding analyses. However, greater precaution is needed to prevent cross-sample contamination during field preparation and laboratory operation, because of the greater ability to detect non-target DNA amplicons present at low copy numbers.
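
The anchoring idea described in the abstract can be illustrated with a toy example. The sketch below is not SOAPBarcode's actual implementation; it is a minimal illustration under stated assumptions: two invented "species" sequences that share a middle region, a tiny k-mer size, error-free reads, and hypothetical helper names (`shotgun`, `build_graph`, `paths_between`). It shows that a de Bruijn graph built from mixed reads contains chimeric paths, and that restricting path extraction to start/end anchors known to come from the same amplicon keeps only the true sequences.

```python
from collections import defaultdict


def shotgun(seq, read_len=9, step=2):
    """Cut overlapping, error-free reads from seq (toy stand-in for shotgun reads)."""
    return [seq[i:i + read_len] for i in range(0, len(seq) - read_len + 1, step)]


def build_graph(reads, k):
    """Map each k-mer to the set of k-mers that follow it with a (k-1)-base overlap."""
    kmers = {r[i:i + k] for r in reads for i in range(len(r) - k + 1)}
    by_prefix = defaultdict(set)
    for km in kmers:
        by_prefix[km[:-1]].add(km)
    return {km: by_prefix.get(km[1:], set()) for km in kmers}


def paths_between(graph, start, end, max_len):
    """Return sequences spelled by simple paths from k-mer start to k-mer end."""
    found = []
    stack = [(start, start, {start})]
    while stack:
        node, seq, seen = stack.pop()
        if node == end:
            found.append(seq)
            continue
        if len(seq) >= max_len:
            continue  # give up on paths longer than the expected amplicon
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                stack.append((nxt, seq + nxt[-1], seen | {nxt}))
    return sorted(found)


# Invented sequences (assumption): distinct ends, identical middle "GATTACAGG".
SPECIES_A = "ACGTCGATTACAGGTTCAA"
SPECIES_B = "TGCATGATTACAGGCCTGA"
K = 5

reads = shotgun(SPECIES_A) + shotgun(SPECIES_B)
graph = build_graph(reads, K)

# Anchor pairs: the first and last k-mer of each amplicon, assumed to be
# linked because they were read from the two ends of the same molecule.
anchor_pairs = [(s[:K], s[-K:]) for s in (SPECIES_A, SPECIES_B)]

# Without pairing information, every start can reach every end.
starts = sorted({s for s, _ in anchor_pairs})
ends = sorted({e for _, e in anchor_pairs})
unanchored = [p for s in starts for e in ends
              for p in paths_between(graph, s, e, max_len=30)]

# With pairing information, only paths joining linked anchors are kept.
anchored = [p for s, e in anchor_pairs
            for p in paths_between(graph, s, e, max_len=30)]

print(len(unanchored), "paths without anchors;", len(anchored), "with anchors")
```

In this toy graph the shared middle region merges the two sequences, so traversal without pairing information also yields the two chimeras (start of one species joined to the end of the other). Real data add sequencing errors, coverage differences, and many more homologs, none of which this sketch models.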
