Article

The wavelet matrix: An efficient wavelet tree for large alphabets

Journal

INFORMATION SYSTEMS
Volume 47, Pages 15-32

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.is.2014.06.002

Keywords

Succinct data structures; Compressed sequence representations

Funding

  1. CONICYT Fondecyt Iniciacion [11130104]
  2. Millennium Nucleus Information and Coordination in Networks ICM/FIC, Chile [P10-024F]
  3. CDTI [CDTI EXP 000645663/ITC-20133062]
  4. Ministerio de Economia y Competitividad-MEC- [CDTI EXP 000645663/ITC-20133062]
  5. Axencia Galega de Innovacion -AGI- [CDTI EXP 000645663/ITC-20133062]
  6. Xunta de Galicia - (FEDER) [GRC2013/053]
  7. MICINN (PGE)
  8. MICINN (FEDER) [TIN2009-14560-C03-02, TIN2010-21246-C02-01]
  9. FPU Program [AP2010-6038]

The wavelet tree is a flexible data structure that represents sequences S[1, n] of symbols over an alphabet of size sigma within compressed space while supporting a wide range of operations on S. When sigma is significant compared to n, current wavelet tree representations incur noticeable space or time overheads. In this article we introduce the wavelet matrix, an alternative representation for large alphabets that retains all the properties of wavelet trees but is significantly faster. We also show how the wavelet matrix can be compressed to the zero-order entropy of the sequence without sacrificing its time performance, and in fact improving it. Our experimental results show that the wavelet matrix outperforms all the wavelet tree variants along the space/time tradeoff map. (C) 2014 Elsevier Ltd. All rights reserved.
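
To make the idea concrete, the following is a minimal, uncompressed sketch of a wavelet matrix in Python. It is not the authors' implementation. It assumes integer symbols in the range [0, sigma), uses 0-based positions rather than the abstract's S[1, n], and replaces succinct rank-enabled bitmaps with plain prefix-count lists, so it shows the level-by-level layout and navigation but none of the space savings. The class name WaveletMatrix and its methods access and rank are illustrative names only.

```python
class WaveletMatrix:
    """Illustrative wavelet matrix over integers in [0, sigma).

    Assumption: plain Python lists stand in for succinct bitmaps,
    so this sketch is not space-efficient.
    """

    def __init__(self, seq, sigma):
        self.n = len(seq)
        # One level per bit needed to write the largest symbol.
        self.levels = max(1, (sigma - 1).bit_length())
        self.ones = []   # per level: prefix counts of 1-bits
        self.zeros = []  # per level: total number of 0-bits
        cur = list(seq)
        for level in range(self.levels):
            shift = self.levels - 1 - level  # most significant bit first
            bits = [(x >> shift) & 1 for x in cur]
            prefix = [0]
            for b in bits:
                prefix.append(prefix[-1] + b)
            self.ones.append(prefix)
            self.zeros.append(self.n - prefix[-1])
            # Stable partition: symbols with bit 0 first, then bit 1.
            cur = ([x for x, b in zip(cur, bits) if b == 0]
                   + [x for x, b in zip(cur, bits) if b == 1])

    def _rank1(self, level, i):
        # Number of 1-bits among the first i positions of this level.
        return self.ones[level][i]

    def _rank0(self, level, i):
        # Number of 0-bits among the first i positions of this level.
        return i - self.ones[level][i]

    def access(self, i):
        """Return the symbol at 0-based position i."""
        sym = 0
        for level in range(self.levels):
            bit = self.ones[level][i + 1] - self.ones[level][i]
            sym = (sym << 1) | bit
            if bit:
                i = self.zeros[level] + self._rank1(level, i)
            else:
                i = self._rank0(level, i)
        return sym

    def rank(self, c, i):
        """Count occurrences of symbol c (0 <= c < sigma) in the first i positions."""
        start, end = 0, i
        for level in range(self.levels):
            bit = (c >> (self.levels - 1 - level)) & 1
            if bit:
                start = self.zeros[level] + self._rank1(level, start)
                end = self.zeros[level] + self._rank1(level, end)
            else:
                start = self._rank0(level, start)
                end = self._rank0(level, end)
        return end - start


seq = [4, 7, 6, 5, 3, 2, 1, 0, 1, 4, 1, 7]
wm = WaveletMatrix(seq, sigma=8)
print(wm.access(3))    # symbol stored at position 3
print(wm.rank(1, 12))  # occurrences of symbol 1 in the whole sequence
```

In this sketch each level stores a single bitmap of length n plus one counter (the number of zeros), and both operations take one step per level, that is, about log2(sigma) steps regardless of how the symbols are distributed.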
