Article

Propagation of Data Fusion

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (2015)

Journal

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING

Volume 27, Issue 5, Pages 1330-1342

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/TKDE.2014.2365807

Keywords

Data fusion; referential integrity; set theory

Categories

Computer Science, Artificial Intelligence; Computer Science, Information Systems; Engineering, Electrical & Electronic

Abstract

In a relational database, tuples are called duplicate if they describe the same real-world entity. If such duplicate tuples are observed, it is recommended to remove them and to replace them with one tuple that represents the joint information of the duplicate tuples to a maximal extent. This remove-and-replace operation is called a fusion operation. Within the setting of a relational database management system, the removal of the original duplicate tuples can breach referential integrity. In this paper, a strategy is proposed to maintain referential integrity in a semantically correct manner, thereby optimizing the quality of relationships in the database. An algorithm is proposed that is able to propagate a fusion operation through the entire database. The algorithm is based on a framework of first and second order fusion functions on the one hand, and conflict resolution strategies on the other hand. It is shown how classical strategies for maintaining referential integrity, such as DELETE cascading, are highly specialized cases of the proposed framework. Experimental results are reported that (i) show the efficiency of the proposed algorithm and (ii) show the differences in quality between several second order fusion functions. It is shown that some strategies easily outperform DELETE cascading.
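The remove-and-replace idea can be illustrated with a minimal sketch. It is not the algorithm from the paper: the `person` and `orders` tables, the `fuse_values` rule (keep the first non-null value), and the choice to redirect every referencing tuple to the fused tuple are all assumptions made for illustration. The sketch fuses two duplicate `person` tuples and propagates the fusion to `orders`, so that the orders survive instead of being lost to DELETE cascading.

```python
import sqlite3


def fuse_values(values):
    """Hypothetical fusion rule for one attribute: keep the first non-null value."""
    return next((v for v in values if v is not None), None)


def fuse_and_propagate(conn, duplicate_ids):
    """Replace duplicate person tuples with one fused tuple and redirect references.

    Assumed schema: person(id, name, email) and orders(id, person_id, item).
    """
    marks = ",".join("?" * len(duplicate_ids))
    rows = conn.execute(
        f"SELECT id, name, email FROM person WHERE id IN ({marks}) ORDER BY id",
        duplicate_ids,
    ).fetchall()
    name = fuse_values([r[1] for r in rows])
    email = fuse_values([r[2] for r in rows])
    with conn:  # run all three steps in a single transaction
        # 1. Insert the fused tuple that represents the joint information.
        fused_id = conn.execute(
            "INSERT INTO person (name, email) VALUES (?, ?)", (name, email)
        ).lastrowid
        # 2. Propagate: point referencing tuples at the fused tuple.
        conn.execute(
            f"UPDATE orders SET person_id = ? WHERE person_id IN ({marks})",
            [fused_id, *duplicate_ids],
        )
        # 3. Remove the originals; no references to them remain, so nothing cascades.
        conn.execute(f"DELETE FROM person WHERE id IN ({marks})", duplicate_ids)
    return fused_id


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            person_id INTEGER NOT NULL REFERENCES person(id) ON DELETE CASCADE,
            item TEXT
        );
        INSERT INTO person VALUES (1, 'A. Smith', NULL), (2, NULL, 'smith@example.org');
        INSERT INTO orders VALUES (10, 1, 'book'), (11, 2, 'pen');
        """
    )
    fuse_and_propagate(conn, [1, 2])
    persons = conn.execute("SELECT id, name, email FROM person").fetchall()
    orders = conn.execute(
        "SELECT id, person_id, item FROM orders ORDER BY id"
    ).fetchall()
    return persons, orders


if __name__ == "__main__":
    print(demo())
```

In this sketch, deleting tuples 1 and 2 without the redirect in step 2 would trigger `ON DELETE CASCADE` and drop both orders. Redirecting first keeps them attached to the fused tuple, which is the kind of difference in relationship quality the abstract refers to.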
