☆ 4.7 Article

Tunnel: Parallel-inducing sort for large string analytics

FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE (2023)

期刊

FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE

卷 149, 期 -, 页码 650-663

出版社

ELSEVIER

DOI: 10.1016/j.future.2023.08.009

关键词

Suffix array; String algorithm; Parallel sorting; String analysis

类别

Computer Science, Theory & Methods

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

智能总结 New
摘要

This paper presents the Tunnel algorithm, the first large-scale parallel suffix array construction algorithm with a time complexity of O based on the PRAM model. The algorithm divides the problem into sub-problems, efficiently induces the order of suffixes with long common prefixes, and transforms a partially ordered suffix set into a total order relation. The Tunnel algorithm is scalable and suitable for large string analytics on parallel systems.

The suffix array is a crucial data structure for efficient string analysis. Over the course of twenty-six years, sequential suffix array construction algorithms have achieved O(n) time complexity and inplace sorting. In this paper, we present the Tunnel algorithm, the first large-scale parallel suffix array (n ) construction algorithm with a time complexity of O based on the parallel random access machine (PRAM) model. The Tunnel algorithm is built on three key ideas: dividing the problem of size O(n) into p sub-problems of reduced size O by replacing long suffixes with shorter prefixes of size at most a constant D; introducing a Tunnel mechanism to efficiently induce the order of a set of suffixes with long common prefixes; developing a strategy to transform a partially ordered suffix set into a total order relation by iteratively applying the Tunnel inducing method. We provide a detailed description of the algorithm, along with a thorough analysis of its time and space complexity, to demonstrate its correctness and efficiency. The proposed Tunnel algorithm exhibits scalable performance, making it suitable for large string analytics on large-scale parallel systems. & COPY; 2023 Elsevier B.V. All rights reserved.

Tunnel: Parallel-inducing sort for large string analytics

期刊

FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE

出版社

ELSEVIER

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

Tunnel: Parallel-inducing sort for large string analytics

期刊

FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE

出版社

ELSEVIER

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文