4.7 Article

RENANO: a REference-based compressor for NANOpore FASTQ files

Journal

BIOINFORMATICS
Volume 37, Issue 24, Pages 4862-4864

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btab437

Keywords

-

Funding

  1. CSIC, Universidad de la Republica, PEDECIBA
  2. Gipuzkoa Fellows grant
  3. Ramon y Cajal grant from Spain

RENANO is a reference-based lossless compressor designed for FASTQ files produced by nanopore sequencing technologies. Compared with its predecessor ENANO, it improves both base call sequence compression and total file compression in two scenarios: one where the reference genome is available to both the compressor and the decompressor, and one where it is available only to the compressor.
Motivation: Nanopore sequencing technologies are rapidly gaining popularity, in part due to the massive amounts of genomic data they produce in short periods of time (up to 8.5 TB of data in <72 h). To reduce the costs of transmission and storage, efficient compression methods for this type of data are needed.
Results: We introduce RENANO, a reference-based lossless data compressor specifically tailored to FASTQ files generated with nanopore sequencing technologies. RENANO improves on its predecessor ENANO, currently the state of the art, by providing a more efficient base call sequence compression component. Two compression algorithms are introduced, corresponding to the following scenarios: (1) a reference genome is available without cost to both the compressor and the decompressor, and (2) the reference genome is available only on the compressor side, and a compacted version of the reference is included in the compressed file. We compare the compression performance of RENANO against ENANO on several publicly available nanopore datasets. Averaged over all the datasets, RENANO improves on ENANO's base call sequence compression by 39.8% in scenario (1) and by 33.5% in scenario (2). For total file compression, the average improvements are 12.7% and 10.6%, respectively. We also show that RENANO consistently outperforms the recent general-purpose genomic compressor Genozip.
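
The general idea behind reference-based compression in scenario (1) can be illustrated with a toy sketch. This is not RENANO's algorithm, which the abstract does not detail; it assumes substitutions only (no insertions or deletions) and an alignment position that is already known. The names `encode_read` and `decode_read` are hypothetical. A read is stored as a position, a length, and a list of mismatches against the shared reference, and the decoder rebuilds it from the same reference.

```python
# Toy illustration of reference-based encoding (NOT RENANO's actual method).
# Assumptions: substitutions only, and the alignment position is given.

def encode_read(read, reference, pos):
    """Describe `read` relative to `reference`, starting at index `pos`."""
    segment = reference[pos:pos + len(read)]
    if len(segment) != len(read):
        raise ValueError("read extends past the end of the reference")
    # Keep only the positions where the read differs from the reference.
    mismatches = [(i, b) for i, (b, r) in enumerate(zip(read, segment)) if b != r]
    return pos, len(read), mismatches


def decode_read(encoded, reference):
    """Rebuild the read from its encoding and the same reference."""
    pos, length, mismatches = encoded
    bases = list(reference[pos:pos + length])
    for i, b in mismatches:
        bases[i] = b
    return "".join(bases)


reference = "ACGTACGTTAGCCGATAGCTAGGCTA"
read = "TAGCCGTTAGC"  # matches reference[8:19] except for one base

encoded = encode_read(read, reference, 8)
print(encoded)
print(decode_read(encoded, reference) == read)
```

When a read closely matches the reference, the mismatch list is short, so the encoding is much smaller than the read itself. In scenario (2) the decoder would not have the reference, which is why the abstract says a compacted version of it is included in the compressed file.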
