Article

CoLoRd: compressing long reads

Journal

NATURE METHODS
Volume 19, Issue 4, Pages 441-+

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41592-022-01432-3

Keywords

-

Funding

  1. National Science Centre, Poland [DEC-2019/33/B/ST6/02040]
  2. US National Institutes of Health [R01HG010040, U01HG010971, U41HG010972]

The cost of maintaining a large amount of data generated by third-generation sequencing has become a significant concern in genomic research. Existing algorithms for compressing long reads have only a slight advantage over general-purpose gzip. In this study, we introduce CoLoRd, an algorithm that can significantly reduce the size of third-generation sequencing data without compromising the accuracy of downstream analyses.
The cost of maintaining exabytes of data produced by sequencing experiments every year has become a major issue in today's genomic research. Despite the increasing popularity of third-generation sequencing, existing algorithms for compressing long reads offer only a minor advantage over general-purpose gzip. We present CoLoRd, an algorithm able to reduce the size of third-generation sequencing data by an order of magnitude without affecting the accuracy of downstream analyses.
