Article

Embedding-Based Subsequence Matching in Time-Series Databases

Journal

ACM Transactions on Database Systems
Volume 36, Issue 3

Publisher

Association for Computing Machinery
DOI: 10.1145/2000824.2000827

Keywords

Algorithms; Performance; Theory; Embedding methods; similarity matching; nearest neighbor retrieval; non-Euclidean spaces; nonmetric spaces

Funding

  1. Finnish Centre of Excellence for Algorithmic Data Analysis Research (ALGODAN)
  2. National Science Foundation [IIS-0705749, IIS-0812601, CNS-0923494, IIS-0812309]
  3. UTA
  4. SemSorGrid4Env
  5. MODAP EC
  6. National Science Foundation, Directorate for Computer & Information Science & Engineering, Division of Information & Intelligent Systems [0812309]
  7. National Science Foundation, Directorate for Computer & Information Science & Engineering, Division of Computer and Network Systems [0923494]

Abstract

We propose an embedding-based framework for subsequence matching in time-series databases that improves the efficiency of processing subsequence matching queries under the Dynamic Time Warping (DTW) distance measure. This framework partially reduces subsequence matching to vector matching, using an embedding that maps each query sequence to a vector and each database time series into a sequence of vectors. The database embedding is computed offline, as a preprocessing step. At runtime, given a query object, an embedding of that object is computed online. Relatively few areas of interest are efficiently identified in the database sequences by comparing the embedding of the query with the database vectors. Those areas of interest are then fully explored using the exact DTW-based subsequence matching algorithm. We apply the proposed framework to define two specific methods. The first method focuses on time-series subsequence matching under unconstrained Dynamic Time Warping. The second method targets subsequence matching under constrained Dynamic Time Warping (cDTW), where warping paths are not allowed to stray too much off the diagonal. In our experiments, good trade-offs between retrieval accuracy and retrieval efficiency are obtained for both methods, and the results are competitive with respect to current state-of-the-art methods.
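As a point of reference for the exact refinement step mentioned above, the following is a minimal sketch of the standard textbook dynamic program for subsequence matching under unconstrained DTW. It is not the authors' implementation and does not include the embedding-based filter. The function name `subsequence_dtw`, the use of one-dimensional numeric sequences, and the absolute difference as the local cost are assumptions made for illustration.

```python
import math


def subsequence_dtw(query, series):
    """Return (cost, end_index) of the best unconstrained DTW match of
    `query` against any contiguous subsequence of `series`.

    Illustrative sketch: 1-D numeric sequences, absolute-difference cost.
    """
    n, m = len(query), len(series)
    if n == 0 or m == 0:
        raise ValueError("query and series must be non-empty")

    # prev[j] holds the cost of aligning the first i query elements to a
    # subsequence of `series` that ends at series[j - 1].
    # Row 0 is all zeros, so a match may start anywhere in the series.
    prev = [0.0] * (m + 1)
    for i in range(1, n + 1):
        # Column 0 is infinite: a non-empty query prefix cannot match
        # an empty subsequence.
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            local = abs(query[i - 1] - series[j - 1])
            curr[j] = local + min(prev[j], prev[j - 1], curr[j - 1])
        prev = curr

    # The match may end anywhere: take the cheapest cell of the last row.
    best_end = min(range(1, m + 1), key=lambda j: prev[j])
    return prev[best_end], best_end - 1


if __name__ == "__main__":
    cost, end = subsequence_dtw([1, 2, 3], [0, 0, 1, 2, 3, 0, 0])
    print(cost, end)
```

This exact search costs time proportional to the product of the query length and the database sequence length, which is why a framework of the kind described in the abstract restricts it to a small number of candidate regions.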
