期刊
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
卷 35, 期 5, 页码 4570-4584出版社
IEEE COMPUTER SOC
DOI: 10.1109/TKDE.2022.3150427
关键词
Random walk; online social networks; graph sampling
Random walk is a popular method for sampling large-scale graphs, but it suffers from slow convergence. To address this issue, we propose a common neighbor aware random walk framework called CNARW, which speeds up convergence by considering common neighbors. We also design efficient unbiased sampling schemes and variant algorithms to reduce cost and improve convergence. Experimental results show that our approach outperforms state-of-the-art random walk sampling algorithms in terms of convergence speed and query cost reduction. Two case studies demonstrate the effectiveness of our sampling framework in solving large-scale graph analysis tasks.
Random walk is widely applied to sample large-scale graphs due to its simplicity of implementation and solid theoretical foundations of bias analysis. However, its computational efficiency is heavily limited by the slow convergence rate (a.k.a. long burn-in period). To address this issue, we propose a common neighbor aware random walk framework called CNARW, which leverages weighted walking by differentiating the next-hop candidate nodes to speed up the convergence. Specifically, CNARW takes into consideration the common neighbors between previously visited nodes and next-hop candidate nodes in each walking step. Based on CNARW, we further develop two efficient unbiased sampling schemes, and we also design two variant algorithms which can reduce sampling cost and speed up the convergence. Experimental results on real-world network datasets show that our approach converges remarkably faster than the state-of-the-art random walk sampling algorithms; and to achieve the same estimation accuracy, our approach reduces the query cost significantly. Last, we use two case studies to demonstrate the effectiveness of our sampling framework in solving large-scale graph analysis tasks.
作者
我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。
推荐
暂无数据