☆ 4.6 Article

MANIACS: Approximate Mining of Frequent Subgraph Patterns through Sampling

ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY (2023)

期刊

ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY

卷 14, 期 3, 页码 -

出版社

ASSOC COMPUTING MACHINERY

DOI: 10.1145/3587254

关键词

Minimum Node Image; pattern mining; VC-dimension

类别

Computer Science, Artificial Intelligence Computer Science, Information Systems

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

智能总结 New
摘要

We propose a sampling-based randomized algorithm called MANIACS for computing high-quality approximations of frequent subgraph patterns in large vertex-labeled graphs. MANIACS provides strong probabilistic guarantees by using the empirical VC dimension and probabilistic tail bounds. It leverages the MNI frequency properties to aggressively prune the pattern search space, resulting in faster exploration of subspaces without frequent patterns. Experimental evaluation shows that MANIACS returns high-quality collections of frequent patterns in large graphs up to two orders of magnitude faster than the exact algorithm.

We present MANIACS, a sampling-based randomized algorithm for computing high-quality approximations of the collection of the subgraph patterns that are frequent in a single, large, vertex-labeled graph, according to the Minimum Node Image-based (MNI) frequency measure. The output of MANIACS comes with strong probabilistic guarantees, obtained by using the empirical Vapnik-Chervonenkis (VC) dimension, a key concept from statistical learning theory, together with strong probabilistic tail bounds on the difference between the frequency of a pattern in the sample and its exact frequency. MANIACS leverages properties of the MNI-frequency to aggressively prune the pattern search space, and thus to reduce the time spent in exploring subspaces that contain no frequent patterns. In turn, this pruning leads to better bounds to the maximum frequency estimation error, which leads to increased pruning, resulting in a beneficial feedback effect. The results of our experimental evaluation of MANIACS on real graphs show that it returns high-quality collections of frequent patterns in large graphs up to two orders of magnitude faster than the exact algorithm.

MANIACS: Approximate Mining of Frequent Subgraph Patterns through Sampling

期刊

ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY

出版社

ASSOC COMPUTING MACHINERY

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

MANIACS: Approximate Mining of Frequent Subgraph Patterns through Sampling

期刊

ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY

出版社

ASSOC COMPUTING MACHINERY

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文