Article

Reporting l most influential objects in uncertain databases based on probabilistic reverse top-k queries

Journal

INFORMATION SCIENCES
Volume 405, Pages 207-226

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2017.04.028

Keywords

Data management; Probabilistic reverse top-k queries; Probabilistic skyline queries; Probabilistic top-l influential queries; Uncertain databases

Funding

  1. Key Program of National Natural Science Foundation of China [61432005]
  2. National Outstanding Youth Science Program of National Natural Science Foundation of China [61625202]
  3. International (Regional) Cooperation and Exchange Program of National Natural Science Foundation of China [61661146006]
  4. National Natural Science Foundation of China [61370095, 61472124]
  5. International Science & Technology Cooperation Program of China [2015DFA11240, 2014DFB30010]
  6. National High-tech R&D Program of China [2015AA015305]
  7. Hunan Provincial Innovation Foundation For Postgraduate [CX2016B066]
  8. Outstanding Graduate Student Innovation Fund Program of Collaborative Innovation Center of High Performance Computing

Abstract

Reverse top-k queries take the perspective of a product manufacturer and are essential for assessing a product's potential market. However, existing approaches to reverse top-k queries all assume that the underlying data are exact (certain). Because of the intrinsic differences between uncertain and certain data, these methods cannot be applied directly to uncertain data sets. Motivated by this, in this paper we first model probabilistic reverse top-k queries over uncertain data. We then formulate a probabilistic top-l influential query, which reports the l most influential objects, that is, those with the largest impact factors, where the impact factor of an object is defined as the cardinality of its probabilistic reverse top-k query result set. We present effective pruning heuristics to speed up these queries. In particular, we exploit several properties of probabilistic threshold top-k queries and probabilistic skyline queries to reduce the search space. In addition, we estimate an upper bound on the number of potential users to reduce the cost of computing the probabilistic reverse top-k queries for the candidate objects. Finally, we present efficient query algorithms that integrate the proposed pruning strategies. Extensive experiments on both real-world and synthetic data sets demonstrate the efficiency and effectiveness of the proposed algorithms. © 2017 Elsevier Inc. All rights reserved.
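
To make the definitions in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not the paper's algorithm and uses none of its pruning heuristics. The data model is an assumption made only for illustration: each uncertain object is a list of (point, probability) instances whose probabilities sum to 1, objects are independent, each user is a weight vector, scores are linear, and smaller scores are better. The names `topk_probability`, `impact_factor`, `top_l_influential`, and the threshold `tau` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed model: an uncertain object is a list of (point, probability) instances.
Instance = Tuple[Tuple[float, ...], float]
UncertainObject = List[Instance]


def score(w: Sequence[float], point: Sequence[float]) -> float:
    """Linear scoring function; smaller is better (an assumption)."""
    return sum(wi * xi for wi, xi in zip(w, point))


def topk_probability(target: UncertainObject, others: List[UncertainObject],
                     w: Sequence[float], k: int) -> float:
    """Probability that `target` ranks in the top-k for user `w`."""
    total = 0.0
    for point, p in target:
        s = score(w, point)
        # dist[c] = probability that exactly c other objects beat score s,
        # tracked only for c = 0 .. k-1 (mass beyond k-1 is dropped).
        dist = [1.0] + [0.0] * (k - 1)
        for obj in others:
            q = sum(pj for pt, pj in obj if score(w, pt) < s)
            new = [0.0] * k
            for c in range(k):
                new[c] += dist[c] * (1.0 - q)
                if c + 1 < k:
                    new[c + 1] += dist[c] * q
            dist = new
        total += p * sum(dist)  # fewer than k other objects beat this instance
    return total


def impact_factor(name: str, objects: Dict[str, UncertainObject],
                  users: List[Sequence[float]], k: int, tau: float) -> int:
    """Size of the probabilistic reverse top-k result set: the number of
    users for whom the object is in the top-k with probability >= tau."""
    target = objects[name]
    others = [inst for n, inst in objects.items() if n != name]
    return sum(1 for w in users
               if topk_probability(target, others, w, k) >= tau)


def top_l_influential(objects: Dict[str, UncertainObject],
                      users: List[Sequence[float]],
                      k: int, tau: float, l: int) -> List[str]:
    """The l objects with the largest impact factors (ties broken by name)."""
    ranked = sorted(objects,
                    key=lambda n: (-impact_factor(n, objects, users, k, tau), n))
    return ranked[:l]


# Toy data, invented for illustration only.
objects = {
    "A": [((1.0, 1.0), 1.0)],
    "B": [((2.0, 2.0), 0.5), ((0.0, 0.0), 0.5)],
    "C": [((3.0, 3.0), 1.0)],
}
users = [(0.5, 0.5), (1.0, 0.0)]
```

On this toy data, calling `top_l_influential(objects, users, k=1, tau=0.4, l=2)` ranks A and B ahead of C, since C is never in any user's top-1. The sketch evaluates every object against every user, which is exactly the cost that the pruning strategies described in the abstract are meant to avoid.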
