4.4 Article

Accelerated butterfly counting with vertex priority on bipartite graphs

Journal

VLDB JOURNAL
Volume 32, Issue 2, Pages 257-281

Publisher

SPRINGER
DOI: 10.1007/s00778-022-00746-0

Keywords

Bipartite graph; Butterfly counting; Dynamic graph

Ask authors/readers for more resources

This paper investigates the problem of efficiently counting butterflies in bipartite graphs and proposes a vertex-priority-based algorithm BFC-VP along with cache-aware strategies for external and parallel contexts. It also tackles the butterfly counting problem on batch-dynamic graphs with fast vertex-priority-based algorithms and optimizations to reduce computation. Extensive empirical studies show that the proposed techniques outperform baseline solutions on real datasets.
Bipartite graphs are of great importance in many real-world applications. Butterfly, which is a complete 2 x 2 biclique, plays a key role in bipartite graphs. In this paper, we investigate the problem of efficient counting the number of butterflies. The most advanced techniques are based on enumerating wedges which is the dominant cost of counting butterflies. Nevertheless, the existing algorithms cannot efficiently handle large-scale bipartite graphs. This becomes a bottleneck in large-scale applications. In this paper, instead of the existing layer-priority-based techniques, we propose a vertex-priority-based paradigm BFC-VP to enumerate much fewer wedges; this leads to a significant improvement of the time complexity of the state-of-the-art algorithms. In addition, we present cache-aware strategies to further improve the time efficiency while theoretically retaining the time complexity of BFC-VP. We also show that our proposed techniques can work efficiently in external and parallel contexts. Moreover, we study the butterfly counting problem on batch-dynamic graphs. Specifically, given a bipartite graph G and a batch-update of edges B, we aim to maintain the number of butterflies in G. To tackle this problem, fast vertex-priority-based algorithms are proposed with optimizations for reducing the computation of existing wedges in G. Our extensive empirical studies demonstrate that the proposed techniques significantly outperform the baseline solutions on real datasets.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.4
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available