4.7 Article

I/O efficient structural clustering and maintenance of clusters for large-scale graphs

Journal

EXPERT SYSTEMS WITH APPLICATIONS
Volume 168

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.eswa.2020.114221

Keywords

Graph; Structural graph clustering; I/O-efficient algorithm; Cluster maintenance; Dynamic graph

Funding

  1. MSIT (Ministry of Science and ICT), Korea, through the NRF [2013M3A9C4078137]
  2. National Research Foundation of Korea - Korea government (MSIT) [2020R1A2C1004032]
  3. National Research Foundation of Korea [2020R1A2C1004032, 2013M3A9C4078137] Funding Source: Korea Institute of Science & Technology Information (KISTI), National Science & Technology Information Service (NTIS)

This study introduces pm-SCAN, an I/O-efficient algorithm for structural clustering of large-scale graphs that works even when memory is limited, and proposes a cluster maintenance method for dynamic graphs that shows significant performance improvement over traditional methods.
In recent years, the size of graph data has increased significantly, but most existing graph clustering algorithms do not consider the case where main memory is too small to hold a large graph. Exploring the entire graph for clustering then causes many random disk accesses to data that are not loaded into memory, resulting in excessive disk I/O and thrashing. To address this problem, we propose an I/O-efficient algorithm for structural clustering of a graph, called pm-SCAN. In the proposed method, if memory is insufficient, the input graph is partitioned into several subgraphs, each smaller than memory, and clustering is first performed on each subgraph. The clusters from the subgraphs are then merged based on the connectivity between them, so that the final result is global with respect to the original input graph. pm-SCAN scales even to very large graphs, i.e., when available memory is severely short, and its result is the same as that of the original structural clustering algorithm SCAN. We also propose a cluster maintenance method for large-scale dynamic graphs that change over time. Instead of reclustering the whole graph, we first identify only the small set of nodes whose structural connectivity may change under a given update operation, then access only those nodes on disk and update their clusters, which reduces maintenance cost. This dynamic graph handling mechanism shows significant performance improvement over the existing method and over the baseline that performs clustering from scratch.
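
The partition-then-merge idea in the abstract can be illustrated with a minimal Python sketch. This is not the pm-SCAN implementation: it keeps the whole adjacency in memory and models no disk I/O. It uses the standard SCAN definitions (structural similarity over closed neighborhoods, cores with at least mu eps-neighbors) and only shows why clustering inside partitions first and merging across cut edges afterwards can give the same clusters as a single pass. The names `cluster_partitioned`, `partition_of`, and `affected_nodes` are hypothetical, and the rule in `affected_nodes` is an assumption based on the similarity definition, not the paper's stated procedure.

```python
from collections import defaultdict
from math import sqrt


def build_adjacency(edges):
    """Closed neighborhoods: each node is a member of its own neighbor set."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    for n in list(adj):
        adj[n].add(n)
    return dict(adj)


def similarity(adj, u, v):
    """SCAN structural similarity of u and v."""
    return len(adj[u] & adj[v]) / sqrt(len(adj[u]) * len(adj[v]))


def is_core(adj, u, eps, mu):
    """u is a core if at least mu closed neighbors are eps-similar to it."""
    return sum(similarity(adj, u, v) >= eps for v in adj[u]) >= mu


class DisjointSet:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def cluster_partitioned(edges, partition_of, eps, mu):
    """Cluster inside partitions first, then merge over cut edges.

    Assumption: adjacency fits in memory here; only the merge logic is shown.
    """
    adj = build_adjacency(edges)
    cores = {u for u in adj if is_core(adj, u, eps, mu)}
    internal = [(u, v) for u, v in edges if partition_of[u] == partition_of[v]]
    cut = [(u, v) for u, v in edges if partition_of[u] != partition_of[v]]
    dsu = DisjointSet()
    for phase in (internal, cut):  # phase 1: local clusters, phase 2: merge
        for u, v in phase:
            if u in cores and v in cores and similarity(adj, u, v) >= eps:
                dsu.union(u, v)
    clusters = defaultdict(set)
    for u in cores:
        clusters[dsu.find(u)].add(u)
        # Attach non-core border nodes; unlike SCAN, which assigns a border
        # node to one cluster, this sketch may place it in several.
        for v in adj[u]:
            if v not in cores and similarity(adj, u, v) >= eps:
                clusters[dsu.find(u)].add(v)
    return sorted(sorted(c) for c in clusters.values())


def affected_nodes(adj, u, v):
    """Nodes with an incident similarity that can change when edge (u, v) is
    inserted or deleted (assumed rule: u, v, and their neighbors)."""
    return adj.get(u, set()) | adj.get(v, set()) | {u, v}


# Two 4-cliques joined by one bridge edge (3, 4).
EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7), (3, 4)]
# A partitioning that deliberately splits both cliques.
SPLIT = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1, 5: 1, 6: 2, 7: 2}
RESULT = cluster_partitioned(EDGES, SPLIT, eps=0.7, mu=3)
# RESULT == [[0, 1, 2, 3], [4, 5, 6, 7]]
```

Because union operations commute, the order in which internal and cut edges are processed does not change the final core clusters, which is consistent with the abstract's claim that the partitioned result matches SCAN. For maintenance, `affected_nodes` bounds the region to re-examine after one edge update instead of reclustering the whole graph.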
