Tsunami: A Learned Multi-dimensional Index for Correlated Data and Skewed Workloads

PROCEEDINGS OF THE VLDB ENDOWMENT (2020)

Journal

PROCEEDINGS OF THE VLDB ENDOWMENT

Volume 14, Issue 2, Pages 74-86

Publisher

ASSOC COMPUTING MACHINERY

DOI: 10.14778/3425879.3425880

Funding

Google
Intel
Microsoft
MIT Data Systems and AI Lab (DSAIL) at MIT
NSF [IIS 1900933]
DARPA Award [16-43-D3M-FP040]
MIT Air Force Artificial Intelligence Innovation Accelerator (AIIA)
United States Air Force Research Laboratory [FA8750-19-2-1000]

Abstract

Filtering data based on predicates is one of the most fundamental operations for any modern data warehouse. Techniques to accelerate the execution of filter expressions include clustered indexes, specialized sort orders (e.g., Z-order), multi-dimensional indexes, and, for high selectivity queries, secondary indexes. However, these schemes are hard to tune and their performance is inconsistent. Recent work on learned multi-dimensional indexes has introduced the idea of automatically optimizing an index for a particular dataset and workload. However, the performance of that work suffers in the presence of correlated data and skewed query workloads, both of which are common in real applications. In this paper, we introduce Tsunami, which addresses these limitations to achieve up to 6x faster query performance and up to 8x smaller index size than existing learned multi-dimensional indexes, in addition to up to 11x faster query performance and 170x smaller index size than optimally-tuned traditional indexes.
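
For readers unfamiliar with the specialized sort orders the abstract mentions, the following is a minimal sketch of a two-dimensional Z-order (Morton) key. It illustrates one of the traditional baseline techniques only, not Tsunami itself; the function names `interleave_bits` and `z_order_sort` and the 16-bit default width are assumptions made for this example.

```python
from typing import List, Tuple


def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Return the Z-order (Morton) code of two non-negative integers.

    Hypothetical helper for illustration: bits of x land on even
    positions of the result and bits of y land on odd positions.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # i-th bit of x -> position 2i
        code |= ((y >> i) & 1) << (2 * i + 1)  # i-th bit of y -> position 2i+1
    return code


def z_order_sort(points: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Sort 2-D integer points by their Z-order code."""
    return sorted(points, key=lambda p: interleave_bits(p[0], p[1]))
```

Sorting rows by such a key tends to place points that are close in both dimensions near each other on disk, so a filter over both columns touches fewer contiguous ranges than a sort on a single column would.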
