4.7 Article

Variable-Length Subsequence Clustering in Time Series

Journal

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TKDE.2020.2986965

Keywords

Time series analysis; Data mining; Optimization; Clustering algorithms; Clustering methods; Adaptation models; Feature extraction; Time series data mining; subsequence clustering; variable-length patterns; time series segmentation

Funding

  1. National Natural Science Foundation of China [61901454, 61971404, 61501434]
  2. Youth Innovation Promotion Association CAS [2019168]
  3. Foundation of Key Laboratory of Space Utilization, Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences [CSU-QZKT-2018-08]

This paper proposes an optimization framework for adaptively estimating the lengths and representations of different patterns in subsequence clustering. By minimizing the errors in subsequence clustering and segmentation under a time series cover constraint, the framework can automatically extract unknown variable-length subsequence clusters from time series.
Subsequence clustering is an important problem in time series data mining. Observing that most time series consist of various patterns with different, unknown lengths, we propose an optimization framework that adaptively estimates the length and representation of each pattern. Our framework minimizes the within-cluster errors of the subsequences with respect to the subsequence clusters and the segmentation, under a time series cover constraint, and allows the subsequence cluster lengths to vary. To optimize the framework, we first generate abundant initial subsequence clusters of different lengths. Then, three cluster operations (splitting, combination, and removal) iteratively refine the cluster lengths and representations by, respectively, splitting clusters that contain different patterns, joining neighboring clusters that belong to the same pattern, and removing clusters until the predefined cluster number is reached. During each cluster refinement, we employ an efficient algorithm based on dynamic programming to alternately optimize the subsequence clusters and the segmentation. Our method can automatically and efficiently extract the unknown variable-length subsequence clusters in a time series. Comparisons with state-of-the-art methods are conducted on various synthetic and real time series, and the quantitative and qualitative results demonstrate the effectiveness of our method.
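
The alternating step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes squared Euclidean error, non-overlapping segments that cover the whole series, and cluster lengths that are already fixed, and it omits the splitting, combination, and removal operations entirely. The names `segment`, `update_centroids`, and `fit` are hypothetical.

```python
from math import inf

def segment(series, centroids):
    """DP cover: split `series` into back-to-back segments, each matched to a
    centroid of the same length, minimizing total squared error (assumed metric)."""
    n = len(series)
    best = [inf] * (n + 1)   # best[t] = minimal error for covering series[:t]
    back = [None] * (n + 1)  # back[t] = (start, cluster index) of the last segment
    best[0] = 0.0
    for t in range(1, n + 1):
        for k, c in enumerate(centroids):
            s = t - len(c)
            if s < 0 or best[s] == inf:
                continue  # segment would start before 0, or prefix is uncoverable
            err = sum((x - y) ** 2 for x, y in zip(series[s:t], c))
            if best[s] + err < best[t]:
                best[t] = best[s] + err
                back[t] = (s, k)
    if best[n] == inf:
        raise ValueError("series cannot be covered by the given cluster lengths")
    segments, t = [], n
    while t > 0:  # walk the back-pointers from the end of the series
        s, k = back[t]
        segments.append((s, t, k))
        t = s
    return best[n], segments[::-1]

def update_centroids(series, segments, centroids):
    """Replace each centroid by the pointwise mean of its assigned segments."""
    new = []
    for k, c in enumerate(centroids):
        members = [series[s:t] for s, t, kk in segments if kk == k]
        if not members:
            new.append(list(c))  # keep an unused centroid unchanged
            continue
        new.append([sum(col) / len(members) for col in zip(*members)])
    return new

def fit(series, centroids, iters=10):
    """Alternate segmentation and centroid updates for a fixed number of rounds."""
    for _ in range(iters):
        _, segments = segment(series, centroids)
        centroids = update_centroids(series, segments, centroids)
    err, segments = segment(series, centroids)
    return err, segments, centroids
```

For example, calling `fit([0, 1, 0, 5, 5, 0, 1, 0, 5, 5], [[0, 0, 0], [4, 4]])` alternates between a length-3 and a length-2 cluster; each returned segment is a `(start, end, cluster_index)` tuple with an exclusive end index.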
