Article

Deep semi-supervised clustering for multi-variate time-series

Journal

NEUROCOMPUTING
Volume 516, Pages 36-47

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2022.10.033

Keywords

Multi-variate time series; Clustering; Semi-Supervised; Constrained clustering


A huge amount of data from sensors can be organized as multivariate time series. In a limited background knowledge setting, semi-supervised clustering methods can effectively utilize a small amount of knowledge. We propose a constrained deep embedding time series clustering framework that manages the temporal dimension and exploits Must-link and Cannot-link constraints for better clustering results.
Huge amounts of data are nowadays produced by a large and disparate family of sensors, which typically measure multiple variables over time. Such rich information can be profitably organized as multivariate time series. Collecting enough labelled samples to set up supervised analysis for this kind of data is challenging, while a reasonable assumption is to have limited background knowledge available that can be injected into the analysis process. In this context, semi-supervised clustering methods are a well-suited tool to get the most out of such a reduced amount of knowledge. With the aim of dealing with multivariate time-series analysis under a limited background knowledge setting, we propose a semi-supervised (constrained) deep embedding time-series clustering framework that exploits knowledge supervision modeled as Must-link and Cannot-link constraints. In more detail, our proposal, named conDetSEC (constrained Deep embedding time SEries Clustering), is based on Gated Recurrent Units (GRUs) in order to explicitly manage the temporal dimension associated with multivariate time-series data. conDetSEC implements a procedure in which an embedding generation step is combined with a clustering refinement step. Both steps exploit the small amount of available knowledge provided by Must-link and Cannot-link constraints. More specifically, during the data embedding generation the constraints are used by jointly optimizing the network parameters via both unsupervised and semi-supervised tasks, while at the refinement step they are used together with the goal of stretching the embedding manifold towards the clustering centroids to recover a clearer cluster structure. Experimental evaluation on real-world benchmarks from diverse domains has highlighted the effectiveness of our proposal in comparison with state-of-the-art unsupervised and semi-supervised time-series clustering methods. (c) 2022 The Author(s). Published by Elsevier B.V.
This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
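
To make the role of the Must-link and Cannot-link constraints more concrete, the following is a minimal sketch of how such pairwise constraints could be turned into a penalty on embedding vectors. It is not the loss used by conDetSEC, which the abstract does not specify. It assumes a generic contrastive-style formulation, and the function name constraint_loss, the margin parameter, and the toy embeddings are all hypothetical.

```python
import numpy as np

def constraint_loss(embeddings, must_link, cannot_link, margin=1.0):
    """Illustrative pairwise-constraint penalty (an assumption, not the paper's loss).

    embeddings  : (n_samples, dim) float array, e.g. the output of a recurrent encoder
    must_link   : list of (i, j) index pairs that should share a cluster
    cannot_link : list of (i, j) index pairs that should be in different clusters
    margin      : hypothetical minimum distance wanted between Cannot-link pairs
    """
    def pair_dist(pairs):
        # Euclidean distance between the two embeddings of each constrained pair.
        if len(pairs) == 0:
            return np.zeros(0)
        idx = np.asarray(pairs)
        diff = embeddings[idx[:, 0]] - embeddings[idx[:, 1]]
        return np.linalg.norm(diff, axis=1)

    ml = pair_dist(must_link)
    cl = pair_dist(cannot_link)
    # Must-link pairs are pulled together: penalize squared distance.
    ml_term = np.sum(ml ** 2)
    # Cannot-link pairs are pushed apart: penalize only when closer than the margin.
    cl_term = np.sum(np.maximum(0.0, margin - cl) ** 2)
    n_pairs = max(len(ml) + len(cl), 1)
    return (ml_term + cl_term) / n_pairs

# Toy embeddings: samples 0 and 1 coincide, sample 2 lies far away.
toy = np.array([[0.0, 0.0], [0.0, 0.0], [3.0, 4.0]])
satisfied = constraint_loss(toy, must_link=[(0, 1)], cannot_link=[(0, 2)])
violated = constraint_loss(toy, must_link=[(0, 2)], cannot_link=[(0, 1)])
```

In this sketch the penalty is zero when the embedding already agrees with the constraints and grows when it contradicts them. In a framework like the one described, a term of this kind would be optimized jointly with an unsupervised objective rather than on its own.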

