Article

Attribute and Object Selection Queries on Objects with Probabilistic Attributes

Journal

ACM TRANSACTIONS ON DATABASE SYSTEMS
Volume 37, Issue 1

Publisher

ASSOC COMPUTING MACHINERY
DOI: 10.1145/2109196.2109199

Keywords

Algorithms; Experimentation; Measurement; Performance; Reliability; Attribute value selection; object selection query; result quality; F-measure; probabilistic data

Funding

  1. NSF [CNS-1118114]
  2. DARPA [HR0011-11-C-0017]
  3. National Science Foundation, Directorate for Computer & Information Science & Engineering [1059436]
  4. National Science Foundation, Division of Computer and Network Systems [1059436]
  5. Division of Information & Intelligent Systems
  6. National Science Foundation, Directorate for Computer & Information Science & Engineering [1118114]


Modern data processing techniques such as entity resolution, data cleaning, information extraction, and automated tagging often produce objects whose attributes are uncertain. This uncertainty is frequently captured as a set of mutually exclusive value choices for each uncertain attribute, with a probability attached to each alternative value. However, lay end-users, as well as some end-applications, might not be able to interpret results presented in such a form. The question is therefore how to present such results in practice, for example to support the attribute-value selection and object selection queries a user might issue. Specifically, in this article we study the problem of maximizing the quality of the answers to these selection queries over such a probabilistic representation. Quality is measured using standard set-based quality metrics. We formalize the problem and then develop efficient approaches that provide high-quality answers for these queries. A comprehensive empirical evaluation over three different domains demonstrates the advantage of our approach over existing techniques.
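
To make the problem setting concrete, the following is a minimal sketch and not the algorithm developed in the article. It assumes that each object carries one uncertain attribute stored as a mapping from mutually exclusive values to probabilities, that objects are independent of one another, and that answer quality is scored by F1 (with F1 defined as 1.0 when both the answer and the true result are empty). All names here (`objects`, `match_prob`, `expected_f1`, `best_prefix_answer`) and the sample data are hypothetical. The sketch answers an object selection query by brute force: it enumerates every possible world to compute the exact expected F1 of a candidate answer set, which is exponential in the number of objects and only practical for tiny inputs.

```python
from itertools import product


def f1(answer, truth):
    """F1 of an answer set against a true result set.

    Assumed convention: 1.0 when both sets are empty.
    """
    if not answer and not truth:
        return 1.0
    tp = len(answer & truth)
    if tp == 0:
        return 0.0
    precision = tp / len(answer)
    recall = tp / len(truth)
    return 2 * precision * recall / (precision + recall)


def expected_f1(answer, match_prob):
    """Exact expected F1 of `answer`, enumerating all possible worlds.

    `match_prob` maps each object id to the probability that the object
    satisfies the query; objects are assumed independent.
    """
    ids = list(match_prob)
    total = 0.0
    for world in product([False, True], repeat=len(ids)):
        prob = 1.0
        truth = set()
        for obj, is_match in zip(ids, world):
            p = match_prob[obj]
            prob *= p if is_match else 1.0 - p
            if is_match:
                truth.add(obj)
        total += prob * f1(answer, truth)
    return total


def best_prefix_answer(match_prob):
    """Among the top-k objects by match probability (k = 0..n),
    return the answer set with the highest expected F1."""
    ranked = sorted(match_prob, key=match_prob.get, reverse=True)
    candidates = [set(ranked[:k]) for k in range(len(ranked) + 1)]
    return max(candidates, key=lambda a: expected_f1(a, match_prob))


# Hypothetical data: one uncertain "tag" attribute per image.
objects = {
    "img1": {"beach": 0.9, "city": 0.1},
    "img2": {"beach": 0.5, "city": 0.5},
    "img3": {"beach": 0.1, "city": 0.9},
}

# Object selection query: tag = "beach".
match_prob = {obj: dist.get("beach", 0.0) for obj, dist in objects.items()}
best = best_prefix_answer(match_prob)
print(sorted(best), round(expected_f1(best, match_prob), 3))
```

In this toy instance the answer set that maximizes expected F1 includes img2, whose match probability is exactly 0.5, so it differs from what a strict "probability above 0.5" cutoff would return. This suggests why choosing the answer with the quality metric in mind can matter.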
