4.3 Article

Inducing enhanced suffix arrays for string collections

Journal

THEORETICAL COMPUTER SCIENCE
Volume 678, Issue -, Pages 22-39

Publisher

ELSEVIER SCIENCE BV
DOI: 10.1016/j.tcs.2017.03.039

Keywords

Data structures; Suffix array; LCP array; Document array; String collections

Funding

  1. CAPES
  2. CNPq [162338/2015-5]

Ask authors/readers for more resources

Constructing the suffix array for a string collection is an important task that may be performed by sorting the concatenation of all strings. In this article we present algorithms gSAIS and gSACA-K, which extend SAIS (Nong et al., 2011) [8] and SACA-IC (Nong, 2013) [10] to construct the suffix array for a string collection maintaining their theoretical bounds, respecting the order among all suffixes, and improving their practical performance. gSACA-K is an optimal suffix array construction algorithm for string collections from constant alphabets. Moreover, we show how to modify gSAIS and gSACA-K to also compute the longest common prefix array and the document array as a byproduct of suffix sorting, with the same theoretical bounds. We performed experiments that showed that our algorithms have an outstanding practical performance. (C) 2017 Elsevier B.V. All rights reserved.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.3
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available