Article

Rapid evaluation and quality control of next generation sequencing data with FaQCs

Journal

BMC BIOINFORMATICS
Volume 15

Publisher

BMC
DOI: 10.1186/s12859-014-0366-2

Keywords

Quality control; Trimming; Next generation sequencing analysis; Data preprocessing

Funding

  1. U.S. Department of Energy Joint Genome Institute through the Office of Science of the U.S. Department of Energy [DE-AC02-05CH11231]
  2. NIH [Y1-DE-6006-02]
  3. U.S. Department of Homeland Security [HSHQDC08X00790]
  4. U.S. Defense Threat Reduction Agency [B104153I, B084531I]

Background: Next generation sequencing (NGS) technologies that parallelize the sequencing process and produce thousands to millions, or even hundreds of millions of sequences in a single sequencing run, have revolutionized genomic and genetic research. Because of the vagaries of any platform's sequencing chemistry, the experimental processing, machine failure, and so on, the quality of sequencing reads is never perfect, and often declines as the read is extended. These errors invariably affect downstream analysis/application and should therefore be identified early on to mitigate any unforeseen effects.

Results: Here we present a novel FastQ Quality Control Software (FaQCs) that can rapidly process large volumes of data, and which improves upon previous solutions to monitor the quality and remove poor quality data from sequencing runs. Both the speed of processing and the memory footprint of storing all required information have been optimized via algorithmic and parallel processing solutions. The trimmed output compared side-by-side with the original data is part of the automated PDF output. We show how this tool can help data analysis by providing a few examples, including an increased percentage of reads recruited to references, improved single nucleotide polymorphism identification as well as de novo sequence assembly metrics.

Conclusion: FaQCs combines several features of currently available applications into a single, user-friendly process, and includes additional unique capabilities such as filtering the PhiX control sequences, conversion of FASTQ formats, and multi-threading. The original data and trimmed summaries are reported within a variety of graphics and reports, providing a simple way to do data quality control and assurance.
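The abstract does not describe how FaQCs trims reads or converts FASTQ encodings, so the following Python is only a minimal sketch of the general ideas, not the tool's actual algorithm. It assumes Phred+33 quality strings by default, uses a quality threshold of 20 chosen purely for illustration, and all function names (`phred_scores`, `trim_3prime`, `phred64_to_phred33`) are hypothetical. It trims low-quality bases from the 3' end, where read quality tends to decline, and re-encodes a Phred+64 quality string as Phred+33.

```python
def phred_scores(quality, offset=33):
    """Decode an ASCII quality string into a list of Phred scores."""
    return [ord(ch) - offset for ch in quality]


def trim_3prime(seq, quality, min_q=20, offset=33):
    """Drop trailing bases whose Phred score is below min_q.

    This is a simple end-trimming rule assumed for illustration;
    real QC tools may use windowed or cumulative-score approaches.
    """
    scores = phred_scores(quality, offset)
    end = len(seq)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], quality[:end]


def phred64_to_phred33(quality):
    """Re-encode a Phred+64 quality string as Phred+33 (shift by 31)."""
    return "".join(chr(ord(ch) - 31) for ch in quality)


if __name__ == "__main__":
    # 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
    seq, qual = trim_3prime("ACGTACGT", "IIIIII##")
    print(seq, qual)
    # 'h' encodes Q40 and 'B' encodes Q2 under Phred+64.
    print(phred64_to_phred33("hhhB"))
```

A read whose last bases fall below the threshold is shortened, while a read with uniformly high quality passes through unchanged; a read with no base at or above the threshold comes back empty and would typically be discarded by a length filter.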
