Article

RNACache: A scalable approach to rapid transcriptomic read mapping using locality sensitive hashing

Journal

JOURNAL OF COMPUTATIONAL SCIENCE
Volume 60

Publisher

ELSEVIER
DOI: 10.1016/j.jocs.2022.101572

Keywords

Bioinformatics; Next-generation sequencing; RNA-seq; Transcriptomics; Read mapping; Hashing; Parallelism; Big data

Funding

  1. RMU Initiative Funding for Research by the Rhine-Main Universities (Johannes Gutenberg University Mainz) as part of the project RMU Network for Deep Continuous-Discrete Machine Learning (DeCoDeML)
  2. RMU Initiative Funding for Research by the Rhine-Main Universities (Goethe University Frankfurt) as part of the project RMU Network for Deep Continuous-Discrete Machine Learning (DeCoDeML)
  3. RMU Initiative Funding for Research by the Rhine-Main Universities (TU Darmstadt) as part of the project RMU Network for Deep Continuous-Discrete Machine Learning (DeCoDeML)


RNACache is a novel approach based on context-aware locality sensitive hashing for detecting local similarities between transcriptomes and RNA-seq reads. It consists of a three-step processing pipeline that accurately identifies truly expressed transcript isoforms and offers better performance and scalability compared to other lightweight mapping tools.
Mapping reads to transcriptomes is a crucial initial step in bioinformatics RNA-seq pipelines. Because alignment-based methods have high computational complexity, lightweight alignment-free methods are becoming increasingly important. We present RNACache, a novel approach to detecting local similarities between transcriptomes and RNA-seq reads based on context-aware locality sensitive hashing. Its processing pipeline has three steps, subsampling of k-mers, match-based (online) filtering, and coverage-based filtering, which together identify truly expressed transcript isoforms. Our performance evaluation shows that RNACache produces transcriptomic mappings of high accuracy that include significantly fewer erroneous matches than the state-of-the-art lightweight mappers RapMap, Salmon, and Kallisto. Furthermore, it scales well with the number of utilized CPU cores and has the best runtime performance at low memory consumption on modern multi-core workstations. This is an extended version of our previously published conference paper (Cascitti et al., 2021). RNACache is available at https://github.com/jcasc/rnacache.
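
The three steps named in the abstract can be illustrated with a toy sketch. This is not RNACache's implementation: the function names, the parameters `k`, `s`, `w`, `min_hits`, and `min_coverage`, the BLAKE2b hash, and the min-hash style subsampling (keeping the `s` smallest k-mer hashes per window) are all assumptions chosen for illustration, and the context-aware part of the hashing is not modeled.

```python
import hashlib
from collections import Counter, defaultdict


def kmer_hashes(seq, k):
    """Hash every overlapping k-mer of seq to a 64-bit integer (illustrative hash)."""
    return {
        int.from_bytes(hashlib.blake2b(seq[i:i + k].encode(), digest_size=8).digest(), "big")
        for i in range(len(seq) - k + 1)
    }


def sketch(window, k, s):
    """Step 1 (subsampling): keep only the s smallest k-mer hashes of a window."""
    return sorted(kmer_hashes(window, k))[:s]


def split_windows(seq, w, k):
    """Cut seq into windows of length w that overlap by k - 1 characters (assumes w >= k)."""
    stride = w - k + 1
    last_start = max(len(seq) - k, 0)
    return [seq[start:start + w] for start in range(0, last_start + 1, stride)]


def build_index(transcripts, k, s, w):
    """Map each sketch hash to the (transcript, window id) pairs whose sketch contains it."""
    index = defaultdict(set)
    n_windows = {}
    for name, seq in transcripts.items():
        wins = split_windows(seq, w, k)
        n_windows[name] = len(wins)
        for win_id, win in enumerate(wins):
            for feature in sketch(win, k, s):
                index[feature].add((name, win_id))
    return index, n_windows


def map_read(read, index, k, s, min_hits):
    """Step 2 (match-based filtering): keep transcripts with at least min_hits sketch hits."""
    hits = Counter()
    for feature in sketch(read, k, s):
        for location in index.get(feature, ()):
            hits[location] += 1
    per_transcript = Counter()
    for (name, _), n in hits.items():
        per_transcript[name] += n
    candidates = {name for name, n in per_transcript.items() if n >= min_hits}
    touched = {loc for loc in hits if loc[0] in candidates}
    return candidates, touched


def map_reads(reads, index, n_windows, k, s, min_hits, min_coverage):
    """Step 3 (coverage-based filtering): drop transcripts whose windows are mostly
    untouched by any read, then restrict each read's candidates to the survivors."""
    per_read, covered = [], defaultdict(set)
    for read in reads:
        candidates, touched = map_read(read, index, k, s, min_hits)
        per_read.append(candidates)
        for name, win_id in touched:
            covered[name].add(win_id)
    expressed = {
        name for name, wins in covered.items()
        if len(wins) / n_windows[name] >= min_coverage
    }
    return [cands & expressed for cands in per_read]
```

A caller would build the index once with `build_index(transcripts, k, s, w)` and then pass batches of reads to `map_reads`. In this sketch the coverage filter needs to see all reads before it can decide which transcripts count as expressed, whereas the match-based filter works on each read as it arrives.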
