4.7 Article

Variational autoencoder densified graph attention for fusing synonymous entities: Model and protocol

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 259, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2022.110061

Keywords

Open knowledge graph; Knowledge graph representation; Cluster ranking; Link prediction

Ask authors/readers for more resources

This paper proposes the VGAT model and CR protocol to address the prediction of missing links in open knowledge graphs. The VGAT model automatically mines synonymous features using a variational autoencoder densified graph attention mechanism, while the CR protocol comprehensively evaluates multiple answers from the perspectives of significance and compactness.
The prediction of missing links of open knowledge graphs (OpenKGs) poses unique challenges compared with well-studied curated knowledge graphs (CuratedKGs). Unlike CuratedKGs whose entities are fully disambiguated against a fixed vocabulary, OpenKGs consist of entities represented by non-canonicalized free-form noun phrases and do not require an ontology specification, which drives the synonymity (multiple entities with different surface forms have the same meaning) and sparsity (a large portion of entities with few links). How to capture synonymous features in such sparse situations and how to evaluate the multiple answers pose challenges to existing models and evaluation protocols. In this paper, we propose VGAT, a variational autoencoder densified graph attention model to automatically mine synonymity features, and propose CR, a cluster ranking protocol to evaluate multiple answers in OpenKGs. For the model, VGAT investigates the following key ideas: (1) phrasal synonymity encoder attempts to capture phrasal features, which can automatically make entities with synonymous texts have closer representations; (2) neighbor synonymity encoder mines structural features with a graph attention network, which can recursively make entities with synonymous neighbors closer in representations. (3) densification attempts to densify the OpenKGs by generating similar embeddings and negative samples. For the protocol, CR is designed from the significance and compactness perspectives to comprehensively evaluate multiple answers. Extensive experiments and analysis show the effectiveness of the VGAT model and rationality of the CR protocol. (c) 2022 Elsevier B.V. All rights reserved.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.7
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available