4.7 Article

Extracting data models from background knowledge graphs

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 237, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2021.107818

Keywords

Knowledge graphs; Ontologies; Data modelling

Funding

  1. Science Foundation Ireland (SFI) [SFI/12/RC/2289, SFI/12/RC/2289_P2]
  2. European Regional Development Fund
  3. Fundação para a Ciência e a Tecnologia, Portugal through the LASIGE Research Unit [UIDB/00408/2020, UIDP/00408/2020]
  4. Fundação para a Ciência e a Tecnologia [UIDP/00408/2020, UIDB/00408/2020] Funding Source: FCT

Knowledge Graphs are important in aggregating and publishing knowledge on the Web. This paper proposes the RICDaM framework to facilitate the selection of a data model by generating and ranking candidates that match entity types and properties. Experiments using datasets from the library domain show that this methodology can produce meaningful candidate data models.
Knowledge Graphs have emerged as a core technology to aggregate and publish knowledge on the Web. However, integrating knowledge from different sources that were not specifically designed to be interoperable is not a trivial task. Finding the right ontologies to model a dataset is a challenge, since several valid data models exist and there is no clear agreement between them. In this paper, we propose to facilitate the selection of a data model with the RICDaM (Recommending Interoperable and Consistent Data Models) framework. RICDaM generates and ranks candidates that match entity types and properties in an input dataset. These candidates are obtained by aggregating freely available domain RDF datasets in a knowledge graph and then enriching the relationships between the graph's entities. The entity type and object property candidates are obtained by exploiting the instances and structure of this knowledge graph to compute a score that considers both the accuracy and interoperability of the candidates. Datatype properties are predicted with a random forest model, trained on the knowledge graph properties and their values, so as to make predictions on candidate properties and rank them according to different measures. We present experiments using multiple datasets from the library domain as a use case and show that our methodology can produce meaningful candidate data models, adaptable to specific scenarios and needs. (c) 2021 Elsevier B.V. All rights reserved.
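
The abstract describes ranking candidates by a score that weighs both accuracy and interoperability, but it does not give the formula. The Python sketch below only illustrates the general idea with a simple weighted sum. It is not the RICDaM scoring function: the `Candidate` structure, the `alpha` trade-off parameter, the example terms, and all numeric values are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate ontology term for one entity type or property (hypothetical)."""
    term: str
    accuracy: float          # assumed in [0, 1]: fit to the input dataset
    interoperability: float  # assumed in [0, 1]: reuse across aggregated sources


def combined_score(c: Candidate, alpha: float = 0.5) -> float:
    """Weighted sum of the two measures.

    alpha is an assumed trade-off parameter, not taken from the paper:
    alpha = 1.0 ranks by accuracy only, alpha = 0.0 by interoperability only.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * c.accuracy + (1.0 - alpha) * c.interoperability


def rank_candidates(candidates, alpha=0.5):
    """Return candidates sorted from highest to lowest combined score."""
    return sorted(candidates, key=lambda c: combined_score(c, alpha), reverse=True)


# Illustrative values only; these are not results from the paper.
candidates = [
    Candidate("schema:Book", accuracy=0.9, interoperability=0.5),
    Candidate("bibo:Book", accuracy=0.7, interoperability=0.9),
    Candidate("dcterms:BibliographicResource", accuracy=0.4, interoperability=0.8),
]

ranked = rank_candidates(candidates, alpha=0.5)
for c in ranked:
    print(f"{c.term}: {combined_score(c):.2f}")
```

Changing `alpha` shifts the ranking toward one measure or the other, which is one simple way to picture how a candidate list could be adapted to specific scenarios and needs.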
