Article

CHTKC: a robust and efficient k-mer counting algorithm based on a lock-free chaining hash table

BRIEFINGS IN BIOINFORMATICS (2021)

Journal

BRIEFINGS IN BIOINFORMATICS

Volume 22, Issue 3, Pages -

Publisher

OXFORD UNIV PRESS

DOI: 10.1093/bib/bbaa063

Keywords

assembly; DNA-seq; hash table; sequence analysis; k-mer counting; algorithm

Categories

Biochemical Research Methods; Mathematical & Computational Biology

Funding

National Natural Science Foundation of China [61771165]
International Postdoctoral Exchange Fellowship [20130053]
China Postdoctoral Science Foundation [2018T110302, 2014M551246]

Summary

The paper proposes a new method called CHTKC to efficiently calculate the frequency of each substring of length k in DNA sequences, using a lock-free hash table and linked lists to resolve collisions and optimize memory usage. Thorough testing on multiple datasets shows that using a hash-table-based method remains a feasible solution for the k-mer counting problem.

Abstract

Motivation: Calculating the frequency of occurrence of each substring of length k in DNA sequences is a common task in many bioinformatics applications, including genome assembly, error correction, and sequence alignment. Although the problem is simple, efficient counting of datasets with high sequencing depth or large genome size is a challenge.

Results: We propose a robust and efficient method, CHTKC, to solve the k-mer counting problem with a lock-free hash table that uses linked lists to resolve collisions. We also design new mechanisms to optimize memory usage and handle situations where memory is not enough to accommodate all k-mers. CHTKC has been thoroughly tested on seven datasets under multiple memory usage scenarios and compared with Jellyfish2 and KMC3. Our work shows that using a hash-table-based method to effectively solve the k-mer counting problem remains a feasible solution.
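To make the idea in the abstract concrete, here is a minimal, single-threaded sketch of k-mer counting with a chaining hash table. It is not the CHTKC implementation: the names (`ChainedKmerTable`, `encode_kmers`, `decode_kmer`, `count_kmers`), the 2-bit encoding, the `max_nodes` budget, and the `overflow` list are all illustrative assumptions. The lock-free part is not shown; comments mark where a concurrent version would need atomic operations. The abstract does not describe how CHTKC handles k-mers that do not fit in memory, so the overflow list is only a stand-in for that mechanism.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

# 2-bit nucleotide encoding (an illustrative choice, not taken from the paper).
ENCODE = {"A": 0, "C": 1, "G": 2, "T": 3}


class Node:
    """One chain entry: an encoded k-mer, its count, and a link to the next node."""
    __slots__ = ("kmer", "count", "next")

    def __init__(self, kmer: int, next_node: Optional["Node"]) -> None:
        self.kmer = kmer
        self.count = 1
        self.next = next_node


class ChainedKmerTable:
    """Hash table whose buckets are singly linked lists (separate chaining)."""

    def __init__(self, num_buckets: int, max_nodes: int) -> None:
        self.buckets: List[Optional[Node]] = [None] * num_buckets
        self.max_nodes = max_nodes     # hypothetical memory budget, in nodes
        self.num_nodes = 0
        self.overflow: List[int] = []  # k-mers that did not fit in the budget

    def add(self, kmer: int) -> None:
        index = hash(kmer) % len(self.buckets)
        node = self.buckets[index]
        while node is not None:
            if node.kmer == kmer:
                # A concurrent version would need an atomic increment here.
                node.count += 1
                return
            node = node.next
        if self.num_nodes >= self.max_nodes:
            # Out of budget: set the k-mer aside instead of dropping it.
            self.overflow.append(kmer)
            return
        # Insert at the head of the chain. A lock-free version would publish
        # the new head with a compare-and-swap and retry on failure.
        self.buckets[index] = Node(kmer, self.buckets[index])
        self.num_nodes += 1

    def items(self) -> Iterator[Tuple[int, int]]:
        for head in self.buckets:
            node = head
            while node is not None:
                yield node.kmer, node.count
                node = node.next


def encode_kmers(seq: str, k: int) -> Iterator[int]:
    """Yield each k-mer of seq as an integer, skipping windows with non-ACGT bases."""
    mask = (1 << (2 * k)) - 1
    value = 0
    valid = 0  # consecutive valid bases ending at the current position
    for base in seq.upper():
        code = ENCODE.get(base)
        if code is None:
            value, valid = 0, 0
            continue
        value = ((value << 2) | code) & mask
        valid += 1
        if valid >= k:
            yield value


def decode_kmer(value: int, k: int) -> str:
    """Turn an encoded k-mer back into its nucleotide string."""
    return "".join("ACGT"[(value >> (2 * (k - 1 - i))) & 3] for i in range(k))


def count_kmers(sequences: Iterable[str], k: int,
                num_buckets: int = 1024,
                max_nodes: int = 1_000_000) -> ChainedKmerTable:
    table = ChainedKmerTable(num_buckets, max_nodes)
    for seq in sequences:
        for kmer in encode_kmers(seq, k):
            table.add(kmer)
    return table
```

For example, `count_kmers(["ACGTACGT"], 3)` builds a table in which `ACG` and `CGT` each have count 2, and `GTA` and `TAC` each have count 1. Calling it with `max_nodes=2` keeps only the first two distinct k-mers in the table and places the rest in `overflow`, which a real tool would have to count in some later step.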
