4.7 Article

Algorithm for the Accelerated Calculation of Conceptual Distances in Large Knowledge Graphs

Journal

MATHEMATICS
Volume 11, Issue 23, Pages -

Publisher

MDPI
DOI: 10.3390/math11234806

Keywords

conceptual distance; shortest path algorithms; accelerated calculation; computational complexity


Conceptual distance is the proximity between two concepts within a conceptualization, measured from semantic similarity and relationships. DIS-C is a method that computes semantic similarity/relationships by propagating local distances with an All Pairs Shortest Path (APSP) algorithm. This paper explores alternatives for improving DIS-C, focusing on reducing the computational complexity of analyzing large graphs. The results indicate that a simplified version of DIS-C, based on centrality estimation, reduces processing time by a factor of 2.381 compared with the original version.
Conceptual distance refers to the degree of proximity between two concepts within a conceptualization. It is closely related to semantic similarity and relationships, but its measurement strongly depends on the context of the given concepts. DIS-C is an advancement in the computation of semantic similarity/relationships that is independent of the type of knowledge structure and of the semantic relations used when generating a graph from a knowledge base (ontologies, semantic networks, and hierarchies, among others). The approach determines the semantic similarity between two indirectly connected concepts in an ontology by propagating local distances with an algorithm based on the All Pairs Shortest Path (APSP) problem. This process is carried out for each pair of concepts to establish the most effective and efficient paths connecting them. The algorithm identifies the shortest path between concepts, which allows the most relevant relationships between them to be inferred. However, a critical issue with this process is its computational complexity, which follows from the design of APSP algorithms such as Dijkstra, stated here as O(n^3). This paper studies alternatives for improving DIS-C by adapting approximation algorithms, focusing on Dijkstra, pruned Dijkstra, and sketch-based methods, to compute the conceptual distance. Because DIS-C needs to scale to very large graphs, reducing the associated computational complexity is critical. Tests were performed on different datasets to calculate the conceptual distance with the original version of DIS-C and with the influence area of nodes. When results must be generated under time constraints, the original DIS-C model is not the optimal method. Therefore, we propose a simplified version of DIS-C that calculates conceptual distances based on centrality estimation.
The results obtained for the simple version of DIS-C indicate that processing time decreased by a factor of 2.381 compared with the original DIS-C version. Additionally, for both versions of DIS-C (normal and simple), a two-hop coverage-based approach decreased the computational cost of the APSP algorithm.
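
To make the APSP step described in the abstract concrete, the following is a minimal sketch of computing all-pairs shortest-path distances by running Dijkstra once from every node of a small weighted concept graph. It is an illustration under stated assumptions, not the paper's DIS-C implementation: the graph `toy_graph`, its edge weights, and the function names `dijkstra` and `all_pairs_distances` are hypothetical, and the way DIS-C derives its local distances is not reproduced here.

```python
import heapq


def dijkstra(graph, source):
    """Single-source shortest distances on a digraph with non-negative weights.

    `graph` maps each node to a dict {neighbor: edge_weight}.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path to u was already found
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def all_pairs_distances(graph):
    """Run Dijkstra from every node; this repetition is the cost that grows with graph size."""
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    return {u: dijkstra(graph, u) for u in nodes}


# Hypothetical local distances between concepts (assumed values, not DIS-C weights).
toy_graph = {
    "dog": {"mammal": 1.0},
    "cat": {"mammal": 1.0},
    "mammal": {"dog": 2.0, "cat": 2.0, "animal": 1.0},
    "animal": {"mammal": 2.0},
}

distances = all_pairs_distances(toy_graph)
print(distances["dog"]["cat"])  # distance between two indirectly connected concepts
```

In this toy graph, "dog" and "cat" share no direct edge, so their distance is obtained by propagating the local distances through "mammal". Repeating the single-source search from every node is what makes the exact computation expensive on very large graphs, which motivates the pruned and sketch-based alternatives the abstract mentions.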

