4.2 Article

Wander Join and XDB: Online Aggregation via Random Walks

Journal

ACM TRANSACTIONS ON DATABASE SYSTEMS
Volume 44, Issue 1, Pages -

Publisher

ASSOC COMPUTING MACHINERY
DOI: 10.1145/3284551

Keywords

Join; online aggregation; random walk

Funding

  1. NSF [1251019, 1302663, 1443046]
  2. NSFC [61428204]
  3. HKRGC [GRF-621413, GRF-16211614, GRF-16200415]
  4. Direct For Computer & Info Scie & Enginr
  5. Division Of Computer and Network Systems [1302663] Funding Source: National Science Foundation
  6. Div Of Information & Intelligent Systems
  7. Direct For Computer & Info Scie & Enginr [1251019] Funding Source: National Science Foundation

Ask authors/readers for more resources

Joins are expensive, and online aggregation over joins was proposed to mitigate the cost, which offers users a nice and flexible tradeoff between query efficiency and accuracy in a continuous, online fashion. However, the state-of-the-art approach, in both internal and external memory, is based on ripple join, which is still very expensive and even needs unrealistic assumptions (e.g., tuples in a table are stored in random order). This article proposes a new approach, the wander join algorithm, to the online aggregation problem by performing random walks over the underlying join graph. We also design an optimizer that chooses the optimal plan for conducting the random walks without having to collect any statistics a priori. Compared with ripple join, wander join is particularly efficient for equality joins involving multiple tables, but also supports theta-joins. Selection predicates and group-by clauses can be handled as well. To demonstrate the usefulness of wander join, we have designed and implemented XDB (approXimate DB) by integrating wander join into various systems including PostgreSQL, Spark, and a stand-alone plug-in version using PL/SQL. The design and implementation of XDB has demonstrated wander join's practicality in a full-fledged database system. Extensive experiments using the TPC-H benchmark have demonstrated the superior performance of wander join over ripple join.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.2
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available