4.6 Article

Compression of FASTQ and SAM Format Sequencing Data

Journal

PLOS ONE
Volume 8, Issue 3, Pages -

Publisher

PUBLIC LIBRARY SCIENCE
DOI: 10.1371/journal.pone.0059190

Keywords

-

Funding

  1. Wellcome Trust [098051]
  2. Dell Inc.
  3. Wellcome Trust

Ask authors/readers for more resources

Storage and transmission of the data produced by modern DNA sequencing instruments has become a major concern, which prompted the Pistoia Alliance to pose the SequenceSqueeze contest for compression of FASTQ files. We present several compression entries from the competition, Fastqz and Samcomp/Fqzcomp, including the winning entry. These are compared against existing algorithms for both reference based compression (CRAM, Goby) and non-reference based compression (DSRC, BAM) and other recently published competition entries (Quip, SCALCE). The tools are shown to be the new Pareto frontier for FASTQ compression, offering state of the art ratios at affordable CPU costs. All programs are freely available on SourceForge. Fastqz: https://sourceforge.net/projects/fastqz/, fqzcomp: https://sourceforge.net/projects/fqzcomp/, and samcomp: https://sourceforge.net/projects/samcomp/.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.6
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available