4.7 Article

QVZ: lossy compression of quality values

期刊

BIOINFORMATICS
卷 31, 期 19, 页码 3122-3129

出版社

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btv330

关键词

-

资金

  1. Stanford Graduate Fellowships Program in Science and Engineering
  2. Basque Government
  3. Center for Science of Information (CSoI)
  4. NSF grant [1157849-1-QAZCC]

向作者/读者索取更多资源

Motivation: Recent advancements in sequencing technology have led to a drastic reduction in the cost of sequencing a genome. This has generated an unprecedented amount of genomic data that must be stored, processed and transmitted. To facilitate this effort, we propose a new lossy compressor for the quality values presented in genomic data files (e.g. FASTQ and SAM files), which comprise roughly half of the storage space (in the uncompressed domain). Lossy compression allows for compression of data beyond its lossless limit. Results: The proposed algorithm QVZ exhibits better rate-distortion performance than the previously proposed algorithms, for several distortion metrics and for the lossless case. Moreover, it allows the user to define any quasi-convex distortion function to be minimized, a feature not supported by the previous algorithms. Finally, we show that QVZ-compressed data exhibit better performance in the genotyping than data compressed with previously proposed algorithms, in the sense that for a similar rate, a genotyping closer to that achieved with the original quality values is obtained.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.7
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据