Article

The Node-Similarity Distribution of Complex Networks and Its Applications in Link Prediction

Journal

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
Volume 34, Issue 8, Pages 4011-4023

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TKDE.2020.3026311

Keywords

Complex networks; Tools; Training; Data models; Predictive models; Measurement; Node-similarity; common neighbor; link prediction

Funding

  1. National Natural Science Foundation of China [61971146]
  2. Shanghai Municipal Science and Technology Major Project [2018SHZDZX01]
  3. Open Project of Key Laboratory of Quantum Optics, Chinese Academy of Sciences
  4. ZJLab


This paper investigates the distribution of node similarity, focusing on a measure called common neighbor based similarity (CNS). Using the generating function, it develops a general framework for calculating the CNS distributions of node sets in different networks. It also connects the node-similarity distribution to link prediction, provides analytical solutions for two evaluation metrics, and uses similarity distributions to optimize link prediction.

Quantifying the similarity of nodes has long been an active topic in network science, yet little is known about the distribution of node similarity. In this paper, we consider a typical measure of node similarity called the common neighbor based similarity (CNS). By means of the generating function, we propose a general framework for calculating the CNS distributions of node sets in various networks. In particular, we show that for the Erdős-Rényi random network, the CNS distribution of node sets of any size obeys the Poisson law. Furthermore, we connect the node-similarity distribution to the link prediction problem and derive analytical solutions for two key evaluation metrics: i) precision and ii) area under the receiver operating characteristic curve (AUC). We also use the similarity distributions to optimize link prediction by i) deriving the expected prediction accuracy of similarity scores and ii) providing the optimal prediction priority of unconnected node pairs. Simulation results confirm our theoretical findings and validate the proposed tools for evaluating and optimizing link prediction.
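The Poisson claim for the Erdős-Rényi network can be checked numerically for the simplest case, node pairs. The sketch below is illustrative only and is not taken from the paper. It assumes that the CNS of a pair is the number of common neighbors, and that the Poisson rate is (n - 2)p², which is the standard limit of a Binomial(n - 2, p²) count. The function names and the values of `n` and `p` are hypothetical choices.

```python
import math
import random
from collections import Counter
from itertools import combinations


def er_adjacency(n, p, rng):
    """Sample an Erdős-Rényi G(n, p) graph as a list of neighbor sets."""
    neighbors = [set() for _ in range(n)]
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            neighbors[u].add(v)
            neighbors[v].add(u)
    return neighbors


def cns_histogram(neighbors):
    """Empirical distribution of common-neighbor counts over all node pairs."""
    counts = Counter(
        len(neighbors[u] & neighbors[v])
        for u, v in combinations(range(len(neighbors)), 2)
    )
    total = sum(counts.values())
    return {k: c / total for k, c in counts.items()}


def poisson_pmf(k, lam):
    """Probability that a Poisson(lam) variable equals k."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Hypothetical parameters chosen so the script runs in about a second.
n, p = 300, 0.05
rng = random.Random(0)
empirical = cns_histogram(er_adjacency(n, p, rng))

# Assumed rate: each of the other n - 2 nodes is a common neighbor
# of a given pair with probability p * p.
lam = (n - 2) * p * p

# Print empirical frequency next to the Poisson prediction for small counts.
for k in range(5):
    print(k, round(empirical.get(k, 0.0), 4), round(poisson_pmf(k, lam), 4))
```

For a sparse graph the two printed columns should be close. The match is expected to loosen as `p` grows, because the binomial count is then less well approximated by a Poisson law.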

