4.6 Article

Species-Specific Quality Control, Assembly and Contamination Detection in Microbial Isolate Sequences with AQUAMIS

Journal

GENES
Volume 12, Issue 5

Publisher

MDPI
DOI: 10.3390/genes12050644

Keywords

whole genome sequencing; next generation sequencing; quality control; isolate sequencing; pipeline; assembly; contamination; reproducibility; interoperability

Funding

  1. Federal Government
  2. Ministry of Health [ZMVI1-2518FSB709]
  3. BeONE project within the One Health European Joint Programme (OHEJP) [JRP27-R2-FBZ-BeONE]

AQUAMIS is a Snakemake workflow for extensive quality control and assembly of raw Illumina sequencing data that lets laboratories automate the initial analysis of their microbial whole-genome sequencing data. The workflow performs all steps of primary sequence analysis, visualizes the results in an interactive HTML report with species-specific QC thresholds, and provides a standard-compliant JSON file for downstream analyses. Its ability to reliably predict contamination in high-throughput routine sequencing environments has been demonstrated by analyzing thousands of microbial isolates and in silico contaminated datasets. Intergenus and intragenus contamination is detected most accurately by combining several of the QC metrics available within AQUAMIS.
Sequencing of whole microbial genomes has become a standard procedure for cluster detection, source tracking, outbreak investigation and surveillance of many microorganisms. An increasing number of laboratories are currently in transition from classical methods to next-generation sequencing, generating unprecedented amounts of data. Since the precision of downstream analyses depends significantly on the quality of the raw data generated on the sequencing instrument, a comprehensive and meaningful primary quality control is indispensable. Here, we present AQUAMIS, a Snakemake workflow for extensive quality control and assembly of raw Illumina sequencing data, allowing laboratories to automate the initial analysis of their microbial whole-genome sequencing data. AQUAMIS performs all steps of primary sequence analysis: read trimming, read quality control (QC), taxonomic classification, de novo assembly, reference identification, assembly QC and contamination detection, the last at both the read and the assembly level. The results are visualized in an interactive HTML report that includes species-specific QC thresholds, allowing non-bioinformaticians to assess the quality of sequencing experiments at a glance. All results are also available as a standard-compliant JSON file, facilitating downstream analyses and data exchange. We have applied AQUAMIS to approximately 13,000 microbial isolates and approximately 1000 in silico contaminated datasets, demonstrating the workflow's ability to perform in high-throughput routine sequencing environments and to reliably predict contamination. We found that intergenus and intragenus contamination can be detected most accurately using a combination of different QC metrics available within AQUAMIS.
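To illustrate how species-specific thresholds and a combination of QC metrics might be applied to a per-sample JSON record, the following is a minimal Python sketch. It does not reproduce AQUAMIS itself: the field names, the threshold values and the `evaluate_sample` helper are hypothetical, since the abstract does not specify the JSON schema or the actual thresholds.

```python
import json

# Hypothetical species-specific thresholds. The metric names and values are
# assumptions for illustration, not the ones shipped with AQUAMIS.
THRESHOLDS = {
    "Salmonella enterica": {
        "min_coverage_depth": 30.0,
        "max_contigs": 300,
        "genome_size_range": (4_300_000, 5_300_000),
        "max_minor_genus_fraction": 0.05,
    },
}


def evaluate_sample(sample: dict, thresholds: dict) -> dict:
    """Check one sample against the thresholds of its predicted species.

    Several metrics are evaluated together, so a contaminated sample can be
    flagged by an oversized assembly, by reads assigned to a second genus,
    or by both.
    """
    limits = thresholds.get(sample["species"])
    if limits is None:
        # No thresholds are defined for this species.
        return {"status": "NO_THRESHOLDS", "failed": []}

    low, high = limits["genome_size_range"]
    checks = {
        "coverage_depth": sample["coverage_depth"] >= limits["min_coverage_depth"],
        "contigs": sample["contigs"] <= limits["max_contigs"],
        "assembly_length": low <= sample["assembly_length"] <= high,
        "minor_genus_fraction": (
            sample["minor_genus_fraction"] <= limits["max_minor_genus_fraction"]
        ),
    }
    failed = sorted(name for name, passed in checks.items() if not passed)
    return {"status": "FAIL" if failed else "PASS", "failed": failed}


# Hypothetical per-sample record, parsed as it would be from a JSON report.
record = json.loads("""
{
  "sample": "isolate_001",
  "species": "Salmonella enterica",
  "coverage_depth": 85.2,
  "contigs": 120,
  "assembly_length": 6100000,
  "minor_genus_fraction": 0.2
}
""")

print(record["sample"], evaluate_sample(record, THRESHOLDS))
```

In this sketch each metric is checked independently and the failing ones are listed, so a reviewer can see which signals point to contamination. Whether a sample should be rejected on a single failed metric or only on several is a policy decision that the sketch leaves open.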
