Journal
INFORMATION SCIENCES
Volume 405, Pages 207-226
Publisher
ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2017.04.028
Keywords
Data management; Probabilistic reverse top-k queries; Probabilistic skyline queries; Probabilistic top-l influential queries; Uncertain databases
Funding
- Key Program of National Natural Science Foundation of China [61432005]
- National Outstanding Youth Science Program of National Natural Science Foundation of China [61625202]
- International (Regional) Cooperation and Exchange Program of National Natural Science Foundation of China [61661146006]
- National Natural Science Foundation of China [61370095, 61472124]
- International Science & Technology Cooperation Program of China [2015DFA11240, 2014DFB30010]
- National High-tech R&D Program of China [2015AA015305]
- Hunan Provincial Innovation Foundation For Postgraduate [CX2016B066]
- Outstanding Graduate Student Innovation Fund Program of Collaborative Innovation Center of High Performance Computing
Reverse top-k queries, formulated from the perspective of a product manufacturer, are essential for assessing the potential market of a product. However, existing approaches to reverse top-k queries all assume that the underlying data are exact (certain). Because of the intrinsic differences between uncertain and certain data, these methods cannot be applied directly to uncertain data sets. Motivated by this, we first model probabilistic reverse top-k queries over uncertain data. We then formulate a probabilistic top-l influential query, which reports the l most influential objects, that is, those with the largest impact factors, where the impact factor of an object is defined as the cardinality of its probabilistic reverse top-k query result set. We present effective pruning heuristics to speed up these queries. In particular, we exploit several properties of probabilistic threshold top-k queries and probabilistic skyline queries to reduce the search space. In addition, we estimate an upper bound on the potential users to reduce the cost of computing the probabilistic reverse top-k queries for candidate objects. Finally, we present efficient query algorithms that integrate the proposed pruning strategies. Extensive experiments on both real-world and synthetic data sets demonstrate the efficiency and effectiveness of the proposed algorithms. © 2017 Elsevier Inc. All rights reserved.
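To make the definitions in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not the paper's algorithm and uses none of its pruning heuristics. Several modeling choices are assumptions made here for illustration only: each uncertain object is a small set of discrete instances with probabilities summing to 1, each user is a weight vector with a linear scoring function, lower scores are better, and an object belongs to a user's probabilistic top-k result when its probability of ranking in the top k reaches a threshold `tau`. All function and variable names are hypothetical.

```python
from itertools import product


def topk_probability(objects, target, weights, k):
    """Probability that `target` ranks in the top-k for one user.

    Assumption: enumerates every possible world (one instance per object),
    which is exponential and only feasible for toy inputs.
    """
    names = list(objects)
    total = 0.0
    for world in product(*(objects[n] for n in names)):
        world_prob = 1.0
        scores = {}
        for name, (point, p) in zip(names, world):
            world_prob *= p
            # Assumed linear scoring function; lower score is better.
            scores[name] = sum(w * x for w, x in zip(weights, point))
        better = sum(
            1 for n in names if n != target and scores[n] < scores[target]
        )
        if better < k:
            total += world_prob
    return total


def prob_reverse_topk(objects, users, target, k, tau):
    """Users whose probabilistic top-k result contains `target`."""
    return {
        u
        for u, w in users.items()
        if topk_probability(objects, target, w, k) >= tau
    }


def top_l_influential(objects, users, k, tau, l):
    """The l objects with the largest impact factors.

    Impact factor = size of the probabilistic reverse top-k result set.
    Ties are broken by object name (an arbitrary choice made here).
    """
    impact = {
        n: len(prob_reverse_topk(objects, users, n, k, tau)) for n in objects
    }
    ranked = sorted(impact, key=lambda n: (-impact[n], n))
    return ranked[:l], impact


# Toy data (invented for illustration): object -> [(point, probability)].
objects = {
    "A": [((1.0, 4.0), 0.5), ((2.0, 2.0), 0.5)],
    "B": [((3.0, 1.0), 1.0)],
    "C": [((4.0, 4.0), 0.6), ((1.0, 1.0), 0.4)],
}
# Toy users: user -> preference weight vector.
users = {"u1": (0.8, 0.2), "u2": (0.2, 0.8), "u3": (0.9, 0.1)}

if __name__ == "__main__":
    ranked, impact = top_l_influential(objects, users, k=1, tau=0.5, l=1)
    print(ranked, impact)
```

The exhaustive enumeration of possible worlds shows why the pruning strategies and upper bounds mentioned in the abstract matter: the cost grows with the product of the instance counts of all objects, multiplied by the number of users and candidate objects.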