4.7 Article

Finding Route Hotspots in Large Labeled Networks

Journal

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
Volume 33, Issue 6, Pages 2479-2492

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TKDE.2019.2956924

Keywords

Computer hacking; Communication networks; Collaboration; Indexes; Trojan horses; Feature extraction; Graphs; sequential pattern; community detection; indexing

Funding

  1. National Key Research and Development Program of China [2017YFB0803301]
  2. Natural Science Foundation of China [61976026, U1836215]
  3. 111 Project [B18008]
  4. US National Science Foundation [III-1526499, III-1763325, III-1909323, CNS-1930941]

Ask authors/readers for more resources

This article explores the problem of finding route hotspots in large labeled networks, proposes a scalable algorithm FastRH, and designs an RH-Index index structure for storing hotspot and pattern information. Experimental results demonstrate the effectiveness and scalability of these methods on real-world datasets.
In many advanced network analysis applications, like social networks, e-commerce, and network security, hotspots are generally considered as a group of vertices that are tightly connected owing to the similar characteristics, such as common habits and location proximity. In this article, we investigate the formation of hotspots from an alternative perspective that considers the routes along the network paths as the auxiliary information, and attempt to find the route hotspots in large labeled networks. A route hotspot is a cohesive subgraph that is covered by a set of routes, and these routes correspond to the same sequential pattern consisting of vertices' labels. To the best of our knowledge, the problem of Finding Route Hotspots in Large Labeled Networks has not been tackled in the literature. However, it is challenging as counting the number of hotspots in a network is #P-hard. Inspired by the observation that the sizes of hotspots decrease with the increasing lengths of patterns, we prove several anti-monotonicity properties of hotspots, and then develop a scalable algorithm called FastRH that can use these properties to effectively prune the patterns that cannot form any hotspots. In addition, to avoid the duplicate computation overhead, we judiciously design an effective index structure called RH-Index for storing the hotspot and pattern information collectively, which also enables incremental updating and efficient query processing. Our experimental results on real-world datasets clearly demonstrate the effectiveness and scalability of our proposed methods.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.7
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available