Article

PyroClean: Denoising Pyrosequences from Protein-Coding Amplicons for the Recovery of Interspecific and Intraspecific Genetic Variation

Journal

PLOS ONE
Volume 8, Issue 3

Publisher

Public Library of Science
DOI: 10.1371/journal.pone.0057615

Funding

  1. Leverhulme Trust
  2. Genome Analysis Centre (TGAC)
  3. University of East Anglia
  4. Consejo Superior de Investigaciones Cientificas
  5. Yunnan Province [20080A001]
  6. Chinese Academy of Sciences [0902281081, KSCX2-YW-Z-1027, Y002731079]
  7. National Natural Science Foundation of China [31170498]
  8. Ministry of Science and Technology of China [2012FY110800]
  9. State Key Laboratory of Genetic Resources and Evolution at the Kunming Institute of Zoology
  10. Biotechnology and Biological Sciences Research Council (BBSRC) [BBS/E/T/000PR5885, BBS/E/T/000PR6193]

High-throughput parallel sequencing is a powerful tool for the quantification of microbial diversity through the amplification of nuclear ribosomal gene regions. Recent work has extended this approach to the quantification of diversity within otherwise difficult-to-study metazoan groups. However, nuclear ribosomal genes present both analytical challenges and practical limitations that are a consequence of their mutational properties. Here we exploit useful properties of protein-coding genes for cross-species amplification and denoising of 454 flowgrams. We first use experimental mixtures of species from the class Collembola to amplify and pyrosequence the 5′ region of the COI barcode, and we implement a new algorithm called PyroClean for the denoising of Roche GS FLX pyrosequences. Using parameter values from the analysis of experimental mixtures, we then analyse two communities sampled from field sites on the island of Tenerife. Cross-species amplification success of target mitochondrial sequences in experimental species mixtures is high; however, there is little relationship between template DNA concentrations and pyrosequencing read abundance. Homopolymer error correction and filtering against a consensus reference sequence reduced the volume of unique sequences to approximately 5% of the original unique raw reads. Filtering of remaining non-target sequences attributed to PCR error, sequencing error, or numts further reduced unique sequence volume to 0.8% of the original raw reads. PyroClean reduces or eliminates the need for an additional, time-consuming step to cluster reads into Operational Taxonomic Units, which facilitates the detection of intraspecific DNA sequence variation. PyroCleaned sequence data from field sites in Tenerife demonstrate the utility of our approach for quantifying evolutionary diversity and its spatial structure.
Comparison of our sequence data to public databases reveals that we are able to successfully recover both interspecific and intraspecific sequence diversity.
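
The abstract does not describe how PyroClean works internally, so the following is not its algorithm. It is only a minimal sketch of one property of protein-coding amplicons that can help flag erroneous reads: a homopolymer miscall shifts the reading frame and often introduces an in-frame stop codon. The sketch assumes that reads are already trimmed to start in frame and that the invertebrate mitochondrial genetic code applies (stop codons TAA and TAG; TGA encodes tryptophan). All function names are hypothetical.

```python
from collections import Counter

# Assumption: invertebrate mitochondrial code (NCBI translation table 5),
# in which only TAA and TAG terminate translation.
STOP_CODONS = {"TAA", "TAG"}


def has_in_frame_stop(seq: str, frame: int = 0) -> bool:
    """Return True if any complete codon in the given frame is a stop codon."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return any(codon in STOP_CODONS for codon in codons)


def dereplicate(reads):
    """Collapse identical reads into unique sequences with abundances."""
    return Counter(read.upper() for read in reads)


def drop_reads_with_stops(reads, frame: int = 0):
    """Dereplicate, then keep only unique sequences without an in-frame stop."""
    unique = dereplicate(reads)
    kept = {s: n for s, n in unique.items() if not has_in_frame_stop(s, frame)}
    return unique, kept


# Toy data: three copies of an intact read, plus one read in which the
# homopolymer "TTT" was undercalled as "TT", shifting the frame onto TAG.
reads = ["ATGGCTTTAGGA"] * 3 + ["ATGGCTTAGGA"]
unique, kept = drop_reads_with_stops(reads)
print(len(unique), "unique sequences before filtering,", len(kept), "after")
```

A check like this only catches frameshifts that happen to create a stop codon within the read, so it would complement, not replace, the homopolymer correction and reference-based filtering mentioned in the abstract.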
