Proceedings Paper

TopPPR: Top-k Personalized PageRank Queries with Precision Guarantees on Large Graphs

SIGMOD'18: PROCEEDINGS OF THE 2018 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA (2018)

Journal

SIGMOD'18: PROCEEDINGS OF THE 2018 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA

Volume -, Issue -, Pages 441-456

Publisher

ASSOC COMPUTING MACHINERY

DOI: 10.1145/3183713.3196920

Keywords

Personalized PageRank; Top-k Queries

Funding

National Natural Science Foundation of China [61502503]
National Key Basic Research Program (973 Program) of China [2014CB340403]
Fundamental Research Funds for the Central Universities
Research Funds of Renmin University of China [18XNLG21]
MOE, Singapore [MOE2015-T2-2-069]
NUS, Singapore under an SUG

Abstract

Personalized PageRank (PPR) is a classic metric that measures the relevance of graph nodes with respect to a source node. Given a graph G, a source node s, and a parameter k, a top-k PPR query returns a set of k nodes with the highest PPR values with respect to s. This type of query serves as an important building block for numerous applications in web search and social networks, such as Twitter's Who-To-Follow recommendation service. Existing techniques for top-k PPR, however, suffer from two major deficiencies. First, they either incur prohibitive space and time overheads on large graphs, or fail to provide any guarantee on the precision of top-k results (i.e., the results returned might miss a number of actual top-k answers). Second, most of them require significant pre-computation on the input graph G, which renders them unsuitable for graphs with frequent updates (e.g., Twitter's social graph). To address the deficiencies of existing solutions, we propose TopPPR, an algorithm for top-k PPR queries that ensures at least ρ precision (i.e., at least a ρ fraction of the actual top-k results are returned) with probability at least 1 - 1/n, where ρ ∈ (0, 1] is a user-specified parameter and n is the number of nodes in G. In addition, TopPPR offers non-trivial guarantees on query time in terms of ρ, and it can easily handle dynamic graphs as it does not require any preprocessing. We experimentally evaluate TopPPR using a variety of benchmark datasets, and demonstrate that TopPPR outperforms the state-of-the-art solutions in terms of both efficiency and precision, even when we set ρ = 1 (i.e., when TopPPR returns the exact top-k results). Notably, on a billion-edge Twitter graph, TopPPR requires only 15 seconds to answer a top-500 PPR query with ρ = 1.
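
To make the query and the precision measure in the abstract concrete, the following Python sketch computes PPR scores on a toy graph in two textbook ways (power iteration and Monte Carlo random walks with restart), extracts the top-k nodes, and measures precision as the fraction of the true top-k that an approximate answer recovers. This is not the TopPPR algorithm, which the abstract does not describe in detail. The function names, the adjacency-list representation, the restart probability `alpha = 0.2`, and the handling of dangling nodes are all assumptions made for illustration.

```python
import random
from collections import Counter


def ppr_power_iteration(graph, source, alpha=0.2, iters=100):
    """Near-exact PPR w.r.t. `source` by power iteration.

    graph: dict mapping each node to a list of its out-neighbors.
    alpha: restart probability (an assumed value, not from the abstract).
    """
    pi = {v: 0.0 for v in graph}
    pi[source] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in graph}
        nxt[source] += alpha  # restart mass returns to the source
        for u, out in graph.items():
            if not out:
                # Assumption: a walk at a dangling node jumps back to the source.
                nxt[source] += (1 - alpha) * pi[u]
                continue
            share = (1 - alpha) * pi[u] / len(out)
            for v in out:
                nxt[v] += share
        pi = nxt
    return pi


def ppr_monte_carlo(graph, source, alpha=0.2, walks=10000, seed=0):
    """Approximate PPR as the empirical distribution of walk end points."""
    rng = random.Random(seed)
    ends = Counter()
    for _ in range(walks):
        u = source
        while rng.random() >= alpha:  # continue with probability 1 - alpha
            out = graph[u]
            u = rng.choice(out) if out else source
        ends[u] += 1
    return {v: ends[v] / walks for v in graph}


def top_k(scores, k):
    """Return the k nodes with the highest scores (ties broken by node id)."""
    return sorted(scores, key=lambda v: (-scores[v], v))[:k]


def precision(returned, exact):
    """Fraction of the exact top-k nodes that appear in the returned set."""
    return len(set(returned) & set(exact)) / len(exact)


if __name__ == "__main__":
    toy = {0: [1, 2], 1: [2], 2: [0], 3: [0]}
    exact = top_k(ppr_power_iteration(toy, source=0), k=2)
    approx = top_k(ppr_monte_carlo(toy, source=0), k=2)
    print("exact top-2:", exact)
    print("approx top-2:", approx)
    print("precision:", precision(approx, exact))
```

In this vocabulary, a guarantee of ρ precision means that `precision(returned, exact)` is at least ρ, and ρ = 1 means the returned set equals the exact top-k set. Neither baseline above provides such a guarantee at scale: power iteration touches the whole graph on every pass, and plain Monte Carlo sampling gives no bound on how many true top-k nodes it misses for a fixed number of walks.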
