4.7 Article

A distributed model for sampling large scale social networks

Journal

EXPERT SYSTEMS WITH APPLICATIONS
Volume 186, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.eswa.2021.115773

Keywords

Social networks; Graph sampling; MapReduce paradigm; Degree centrality

With the rapidly increasing amount of data in social networks, analyzing social network content has become more challenging. To reduce a network's size while preserving the original network's properties, methods such as graph coarsening and graph sampling are gaining attention in the scientific community.
Social network content analysis has become more challenging over the years due to the rapidly increasing amount of data. Real social networks are omnipresent in everyday life, which makes the structure of the generated data more complex. A key task in social network analysis is to reduce the network's size and to produce an approximate representation that preserves the original network's properties. This task is known as graph reduction and is gaining increasing attention in the scientific community. A review of the literature reveals diverse methods to address this task. Some of them are based on graph coarsening and are developed to cope with the problem of community detection. Others belong to graph sampling and are designed to reduce the graph's size while preserving its structure, which is our purpose. In this paper, we put forth a distributed model called DGS (Distributed Graph Sampling) that generates a sample in a distributed way. The idea behind distributing our model is to cope with large scale social networks. Our model is based on the MapReduce framework, which allows several data segments to be accessed simultaneously for the computations of the sampling strategy. The core of our model is a new centrality measure, based on degree centrality, that is used to sample the graph. We evaluate the performance and scalability of our DGS model using real world social networks. We also compare our proposed model with four well-known sampling strategies in order to demonstrate its efficiency in preserving the original network's structure.
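
To make the general idea concrete, the following is a minimal single-machine sketch of degree-based sampling written in a map/reduce style. It is not the DGS model: the abstract does not define the new centrality measure, so plain node degree is used as a stand-in. The function names (`map_edges`, `reduce_degrees`, `sample_by_degree`), the chunking scheme, and the "keep the k highest-degree nodes" rule are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_edges(edge_chunk):
    """Map step: emit (node, 1) for each endpoint of every edge in one chunk."""
    for u, v in edge_chunk:
        yield u, 1
        yield v, 1


def reduce_degrees(pairs):
    """Reduce step: sum the emitted counts per node to obtain its degree."""
    degree = defaultdict(int)
    for node, count in pairs:
        degree[node] += count
    return dict(degree)


def sample_by_degree(edges, k, n_chunks=2):
    """Keep the k highest-degree nodes and the edges among them.

    Assumption: plain degree stands in for the paper's centrality measure.
    """
    # Split the edge list into chunks, standing in for distributed data segments.
    chunks = [edges[i::n_chunks] for i in range(n_chunks)]
    degree = reduce_degrees(chain.from_iterable(map_edges(c) for c in chunks))
    # Rank nodes by descending degree; ties are broken by node id for determinism.
    ranked = sorted(degree, key=lambda n: (-degree[n], n))
    kept = set(ranked[:k])
    # Induced subgraph: keep only edges whose endpoints both survive.
    sampled_edges = [(u, v) for u, v in edges if u in kept and v in kept]
    return kept, sampled_edges


edges = [(1, 2), (1, 3), (1, 4), (2, 3), (4, 5)]
kept, sampled_edges = sample_by_degree(edges, k=3)
```

In an actual MapReduce deployment the map calls would run in parallel over separate data segments and the framework would group the emitted pairs by key before the reduce step; here both steps run sequentially in one process.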