4.6 Article

On Investigating Both Effectiveness and Efficiency of Embedding Methods in Task of Similarity Computation of Nodes in Graphs

Journal

APPLIED SCIENCES-BASEL
Volume 11, Issue 1, Pages -

Publisher

MDPI
DOI: 10.3390/app11010162

Keywords

graph embedding; feature representation learning; link-based similarity measures; node-pairs similarity

Funding

  1. National Research Foundation of Korea (NRF) - Korea government (MSIT) [NRF-2020R1A2B5B03001960]
  2. National Research Foundation of Korea (NRF) - Korea government (MSIT) through the Life Basic Research Program [NRF-2019R1G1A1007598]
  3. Next-Generation Information Computing Development Program through the National Research Foundation of Korea (NRF) - Ministry of Science, ICT [NRF-2017M3C4A7069440]

This study investigates the effectiveness and efficiency of graph embedding methods for computing node similarity by comparing them with link-based similarity measures. The results show that, on most datasets, embedding methods are both less effective and less efficient than similarity measures, and that their larger number of parameters makes parameter tuning more time-consuming. In addition, increasing the number of dimensions does not necessarily improve the effectiveness of embedding methods in computing node similarity.
An important task on graphs is computing the similarity between two nodes. Link-based similarity measures (in short, similarity measures) are the well-known, conventional techniques for this task; they exploit the relations between nodes (i.e., links) in the graph. Graph embedding methods (in short, embedding methods) convert the nodes of a graph into vectors in a low-dimensional space while preserving the social relations among nodes in the original graph. Instead of applying a similarity measure to the graph to compute the similarity between nodes a and b, we can take the proximity between the vectors that an embedding method produces for a and b as the similarity between a and b. Although embedding methods have been analyzed in a wide range of machine learning tasks such as link prediction and node classification, they have not been investigated in terms of similarity computation of nodes. In this paper, we investigate both the effectiveness and the efficiency of embedding methods in the task of similarity computation of nodes by comparing them with those of similarity measures. To the best of our knowledge, this is the first work that examines the application of embedding methods to this specific task. Based on extensive experiments with five well-known, publicly available datasets, we make the following observations about embedding methods: (1) they are less effective than similarity measures on all datasets except one; (2) they are less efficient than similarity measures on all datasets except one; (3) they have more parameters than similarity measures, which leads to a time-consuming parameter tuning process; and (4) increasing the number of dimensions does not necessarily improve their effectiveness in computing the similarity of nodes.
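
The two ways of scoring a node pair described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy graph, the choice of the Jaccard coefficient as the link-based measure, cosine similarity as the proximity function, and the hand-written two-dimensional vectors standing in for the output of an embedding method. The abstract does not name the specific measures or embedding methods evaluated in the paper.

```python
import math

# Toy undirected graph as an adjacency map (illustrative data, not from the paper).
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c"},
    "e": {"f"},
    "f": {"e"},
}


def jaccard_similarity(graph, u, v):
    """Link-based similarity: overlap between the neighbor sets of u and v.

    Jaccard is used here only as a simple stand-in for a link-based measure.
    """
    neighbors_u, neighbors_v = graph[u], graph[v]
    union = neighbors_u | neighbors_v
    if not union:
        return 0.0
    return len(neighbors_u & neighbors_v) / len(union)


def cosine_similarity(x, y):
    """Proximity between two embedding vectors."""
    dot = sum(p * q for p, q in zip(x, y))
    norm = math.sqrt(sum(p * p for p in x)) * math.sqrt(sum(q * q for q in y))
    return dot / norm if norm else 0.0


# Hypothetical 2-dimensional vectors standing in for an embedding method's
# output; a real method would learn these from the graph.
embeddings = {
    "b": [0.9, 0.1],
    "d": [0.8, 0.2],
    "e": [0.1, 0.9],
}

# Route 1: apply the similarity measure directly to the graph.
link_based = jaccard_similarity(graph, "b", "d")

# Route 2: compare the vectors of the same two nodes.
embedding_based = cosine_similarity(embeddings["b"], embeddings["d"])

print(f"link-based similarity(b, d):      {link_based:.3f}")
print(f"embedding-based similarity(b, d): {embedding_based:.3f}")
```

The first route needs only the graph. The second needs a trained embedding, which adds choices such as the number of dimensions; this is the source of the extra parameter tuning that the abstract mentions.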
