Article

Towards certain fixes with editing rules and master data

Journal

VLDB JOURNAL
Volume 21, Issue 2, Pages 213-238

Publisher

SPRINGER
DOI: 10.1007/s00778-011-0253-7

Keywords

Certain fix; Editing rule; Master data; Data cleaning; Data quality

Funding

  1. RSE-NSFC
  2. IBM
  3. National Basic Research Program of China (973 Program) [2012CB316200]
  4. NGFR [973 2011CB302602]
  5. NSFC [90818028, 60903149]
  6. Engineering and Physical Sciences Research Council [EP/H008063/1, EP/E029213/1] Funding Source: researchfish
  7. EPSRC [EP/H008063/1, EP/E029213/1] Funding Source: UKRI

A variety of integrity constraints have been studied for data cleaning. While these constraints can detect the presence of errors, they fall short of guiding us to correct the errors. Indeed, data repairing based on these constraints may not find certain fixes that are guaranteed correct, and worse still, may even introduce new errors when attempting to repair the data. We propose a method for finding certain fixes, based on master data, a notion of certain regions, and a class of editing rules. A certain region is a set of attributes that are assured correct by the users. Given a certain region and master data, editing rules tell us what attributes to fix and how to update them. We show how the method can be used in data monitoring and enrichment. We also develop techniques for reasoning about editing rules, to decide whether they lead to a unique fix and whether they are able to fix all the attributes in a tuple, relative to master data and a certain region. Furthermore, we present a framework and an algorithm to find certain fixes, by interacting with the users to ensure that one of the certain regions is correct. We experimentally verify the effectiveness and scalability of the algorithm.
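As a rough illustration of the idea in the abstract, the sketch below applies simplified editing rules to one input record. It is not the paper's algorithm or formal rule syntax: the `EditingRule` class, the `apply_rules` function, the attribute names, and the sample data are all hypothetical. The assumption made here is that a rule fires only when the attributes it matches on are already validated (initially, the certain region), and only when the matching master tuples agree on a single value to copy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditingRule:
    """Hypothetical, simplified editing rule (not the paper's formal syntax)."""
    match_attrs: tuple  # pairs (input_attr, master_attr) that must agree
    target: tuple       # (input_attr_to_fix, master_attr_to_copy)


def apply_rules(record, certain_region, rules, master):
    """Fix attributes of `record` using `rules` and `master` data.

    Returns the updated record and the set of validated attributes.
    """
    fixed = dict(record)
    validated = set(certain_region)
    changed = True
    while changed:  # repeat until no rule can validate a new attribute
        changed = False
        for rule in rules:
            tgt, src = rule.target
            if tgt in validated:
                continue  # never overwrite an attribute already assured correct
            if not all(a in validated for a, _ in rule.match_attrs):
                continue  # only match on attributes known to be correct
            matches = [
                m for m in master
                if all(fixed[a] == m[ma] for a, ma in rule.match_attrs)
            ]
            values = {m[src] for m in matches}
            if len(values) == 1:  # apply only when the fix is unambiguous
                fixed[tgt] = values.pop()
                validated.add(tgt)
                changed = True
    return fixed, validated


# Hypothetical master data and rules, for illustration only.
master = [
    {"zip": "EH8 9AB", "city": "Edinburgh", "area_code": "131"},
    {"zip": "NW1 6XE", "city": "London", "area_code": "020"},
]
rules = [
    EditingRule((("zip", "zip"),), ("city", "city")),
    EditingRule((("city", "city"),), ("area_code", "area_code")),
]
record = {"zip": "EH8 9AB", "city": "Ldn", "area_code": "020", "name": "Bob"}

# The user assures that `zip` is correct: this is the certain region.
fixed, validated = apply_rules(record, {"zip"}, rules, master)
print(fixed)
print(sorted(validated))
```

In this toy run, `name` is never validated because no rule covers it. That loosely mirrors the two questions the abstract raises: whether the rules lead to a unique fix, and whether they can fix every attribute in a tuple relative to the master data and the chosen certain region.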
