Proceedings Paper

How Linked Data can Aid Machine Learning-Based Tasks

Publisher

SPRINGER INTERNATIONAL PUBLISHING AG
DOI: 10.1007/978-3-319-67008-9_13

Keywords

Linked Data; Machine Learning; Feature Discovery & Selection; Automatic classification; Prediction

Funding

  1. European Union's Horizon 2020 Research and Innovation programme under the BlueBRIDGE project [675680]

Discovering useful data for a given problem is of primary importance, since data scientists usually spend a lot of time discovering, collecting and preparing data before using them, e.g., for applying or testing machine learning algorithms. In this paper we propose a general method for easily discovering, creating and selecting valuable features that describe a set of entities, so that they can be leveraged in a machine learning context. We demonstrate the feasibility of this approach with a tool (research prototype) called LODsyndesis(ML), based on Linked Data technologies, that (a) automatically discovers datasets in which the entities of interest occur, (b) shows the user a large number of useful features for these entities, and (c) automatically creates the selected features by sending SPARQL queries. We evaluate this approach by exploiting data from several sources, including the British National Library, to create datasets for predicting whether a book or a movie is popular or non-popular. Our evaluation uses 5-fold cross-validation, and we present comparative results for a number of different features and models. The evaluation showed that the additional features improved prediction accuracy.
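As a rough illustration of step (c) and of the 5-fold evaluation mentioned above, the following minimal Python sketch builds a SPARQL query that turns one property into a numeric feature (the number of values per entity) and splits a dataset into five folds. It is not the LODsyndesis(ML) implementation: the function names `build_feature_query` and `k_fold_indices`, the `example.org` URIs, and the choice of a value count as the feature are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def build_feature_query(entities: Sequence[str], prop: str) -> str:
    """Build a SPARQL 1.1 query counting the values of `prop` per entity.

    Hypothetical helper: the count is just one possible way to derive a
    numeric feature from a property; it is not taken from the paper.
    """
    values = " ".join(f"<{uri}>" for uri in entities)
    return (
        "SELECT ?e (COUNT(?o) AS ?n) WHERE { "
        f"VALUES ?e {{ {values} }} "
        # OPTIONAL keeps entities that have no value for the property,
        # so they receive a count of 0 instead of being dropped.
        f"OPTIONAL {{ ?e <{prop}> ?o }} "
        "} GROUP BY ?e"
    )


def k_fold_indices(n: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split the indices 0..n-1 into k (train, test) pairs.

    Each index appears in exactly one test fold; indices are assigned
    round-robin, so the data should be shuffled beforehand if its
    order is meaningful.
    """
    folds = [list(range(i, n, k)) for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, test))
    return splits


if __name__ == "__main__":
    # Placeholder URIs, not real dataset identifiers.
    books = ["http://example.org/book/1", "http://example.org/book/2"]
    print(build_feature_query(books, "http://example.org/prop/award"))
    for train, test in k_fold_indices(10, k=5):
        print("train:", train, "test:", test)
```

The query string would be sent to a SPARQL endpoint by whatever client is at hand; the sketch deliberately stops short of any network call, since the endpoints and properties used by the tool are not given here.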
