☆ 4.2 Article

On the Complexity of String Matching for Graphs

ACM TRANSACTIONS ON ALGORITHMS (2023)

期刊

ACM TRANSACTIONS ON ALGORITHMS

卷 19, 期 3, 页码 -

出版社

ASSOC COMPUTING MACHINERY

DOI: 10.1145/3588334

关键词

Exact pattern matching; graph query; graph search; labeled graphs; string matching; string search; Strong Exponential Time Hypothesis; heterogeneous networks; variation graphs

类别

Computer Science, Theory & Methods Mathematics, Applied

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

智能总结 New
摘要

Exact string matching in labeled graphs is a fundamental problem that is at the core of various applications in computational biology, graph databases, and heterogeneous networks. We provide a conditional lower bound that proves the hardness of achieving efficient exact string matching algorithms in graphs, unless certain computational assumptions are false. Our results hold even for restricted cases, such as undirected graphs and directed acyclic graphs, making the lower bound stricter than previous bounds derived from regular expression matching. Additionally, our findings suggest that exact and approximate matching have the same level of difficulty in graphs under certain assumptions, while in strings, they have different time complexities.

Exact string matching in labeled graphs is the problem of searching paths of a graph G = (V, E) such that the concatenation of their node labels is equal to a given pattern string P[1..m]. This basic problem can be found at the heart of more complex operations on variation graphs in computational biology, of query operations in graph databases, and of analysis operations in heterogeneous networks. We prove a conditional lower bound stating that, for any constant epsilon > 0, an O(|E|(1-epsilon) m) time, or an O(|E| m(1-epsilon)) time algorithm for exact string matching in graphs, with node labels and pattern drawn from a binary alphabet, cannot be achieved unless the Strong Exponential Time Hypothesis (SETH) is false. This holds even if restricted to undirected graphs with maximum node degree 2-that is, to zig-zag matching in bidirectional strings, or to deterministic directed acyclic graphs whose nodes have maximum sum of indegree and outdegree 3. These restricted cases make the lower bound stricter than what can be directly derived from related bounds on regular expression matching (Backurs and Indyk, FOCS'16). In fact, our bounds are tight in the sense that lowering the degree or the alphabet size yields linear time solvable problems. An interesting corollary is that exact and approximate matching are equally hard (i.e., quadratic time) in graphs under SETH. In comparison, the same problems restricted to strings have linear time vs quadratic time solutions, respectively (approximate pattern matching having also a matching SETH lower bound (Backurs and Indyk, STOC'15)).

On the Complexity of String Matching for Graphs

期刊

ACM TRANSACTIONS ON ALGORITHMS

出版社

ASSOC COMPUTING MACHINERY

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

On the Complexity of String Matching for Graphs

期刊

ACM TRANSACTIONS ON ALGORITHMS

出版社

ASSOC COMPUTING MACHINERY

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文