4.5 Article

Computing Graph Descriptors on Edge Streams

Journal

Publisher

ASSOC COMPUTING MACHINERY
DOI: 10.1145/3591468

Keywords

Graph descriptor; edge stream; graph classification

Ask authors/readers for more resources

Feature extraction is crucial for graph analytics as it provides graph descriptors used in downstream analysis models. Existing algorithms for computing these descriptors cannot handle large graphs due to memory constraints and lack of user control over runtime. This article presents streaming algorithms that approximates three graph descriptors and eliminates the need to store the entire graph in memory. The proposed descriptors demonstrate high accuracy and can be computed within minutes even for graphs with millions of edges, using only 25% of the memory.
Feature extraction is an essential task in graph analytics. These feature vectors, called graph descriptors, are used in downstream vector-space-based graph analysis models. This idea has proved fruitful in the past, with spectral-based graph descriptors providing state-of-the-art classification accuracy. However, known algorithms to compute meaningful descriptors do not scale to large graphs since: (1) they require storing the entire graph in memory, and (2) the end-user has no control over the algorithm's runtime. In this article, we present streaming algorithms to approximately compute three different graph descriptors capturing the essential structure of graphs. Operating on edge streams allows us to avoid storing the entire graph in memory, and controlling the sample size enables us to keep the runtime of our algorithms within desired bounds. We demonstrate the efficacy of the proposed descriptors by analyzing the approximation error and classification accuracy. Our scalable algorithms compute descriptors of graphs with millions of edges within minutes. Moreover, these descriptors yield predictive accuracy comparable to the state-of-the-art methods but can be computed using only 25% as much memory.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.5
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available