Article

An Improved Self-Training Method for Positive Unlabeled Time Series Classification Using DTW Barycenter Averaging

Journal

SENSORS
Volume 21, Issue 21

Publisher

MDPI
DOI: 10.3390/s21217414

Keywords

positive unlabeled time series classification; self-training; dynamic time warping; DTW barycenter averaging

Funding

  1. National Key Research and Development Program of China [2019YFC1520905]
  2. Zhejiang Provincial Cultural Relics Protection Science and Technology Project [2020010, 2017007]

This paper studies self-training methods for the positive unlabeled time series classification problem and proposes a new approach called ST-average, which labels data using an average sequence that is more representative and reliable than the individual labeled sequences used by traditional methods.
Traditional supervised time series classification (TSC) tasks assume that all training data are labeled. In practice, however, manually labelling all unlabeled data can be very time-consuming and often requires the participation of skilled domain experts. In this paper, we are concerned with the positive unlabeled time series classification (PUTSC) problem, which refers to automatically labelling a large unlabeled set U based on a small positive labeled set PL. Self-training (ST) is the most widely used method for solving the PUTSC problem and has attracted increased attention due to its simplicity and effectiveness. Existing ST methods simply employ the one-nearest-neighbor (1NN) formula to determine which unlabeled time series should be labeled. Nevertheless, we note that the 1NN formula might not be optimal for PUTSC tasks because it may be sensitive to initial labeled data located near the boundary between the positive and negative classes. To overcome this issue, we propose an exploratory methodology called ST-average. Unlike conventional ST-based approaches, ST-average uses the average sequence calculated by the DTW barycenter averaging technique to label the data. Compared with any individual sequence in the PL set, the average sequence is more representative. Our proposal is insensitive to the initial labeled data and is more reliable than existing ST-based methods. In addition, we demonstrate that ST-average can naturally be combined with many existing techniques used in the original ST. Experimental results on public datasets show that ST-average performs better than related popular methods.
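The contrast between the 1NN selection rule and the average-sequence rule can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`dtw_path`, `dba`, `self_train`), the squared-difference local cost, the choice of the first sequence as the initial average, the fixed number of DBA iterations, and the fixed number of labelling steps (in place of a real stopping criterion) are all assumptions made for illustration.

```python
from math import inf

def dtw_path(a, b):
    """Return (DTW cost with squared differences, warping path) for 1-D sequences."""
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end of both sequences to recover the alignment.
    i, j, path = n, m, []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path

def dba(sequences, n_iter=10):
    """Simplified DTW barycenter averaging (assumed: init from the first sequence)."""
    avg = list(sequences[0])
    for _ in range(n_iter):
        buckets = [[] for _ in avg]
        for s in sequences:
            _, path = dtw_path(avg, s)
            for i, j in path:
                buckets[i].append(s[j])  # points of s aligned to avg[i]
        avg = [sum(b) / len(b) for b in buckets]
    return avg

def self_train(pl, unlabeled, n_steps, use_average=True):
    """Move n_steps series from `unlabeled` into PL; return their indices in order.

    use_average=True  -> rank by DTW distance to the DBA average of PL.
    use_average=False -> rank by distance to the nearest PL member (1NN rule).
    n_steps is a hypothetical stand-in for a proper stopping criterion.
    """
    pl = [list(s) for s in pl]
    pool = list(range(len(unlabeled)))
    order = []
    for _ in range(min(n_steps, len(pool))):
        refs = [dba(pl)] if use_average else pl
        best = min(pool, key=lambda u: min(dtw_path(r, unlabeled[u])[0] for r in refs))
        order.append(best)
        pl.append(list(unlabeled[best]))
        pool.remove(best)
    return order

if __name__ == "__main__":
    PL = [[0, 1, 2, 1, 0]]                     # one positive example (toy data)
    U = [[0, 1, 2, 2, 1, 0], [5, 5, 5, 5, 5], [0, 0, 1, 2, 1, 0]]
    print(self_train(PL, U, n_steps=2, use_average=True))
    print(self_train(PL, U, n_steps=2, use_average=False))
```

The only difference between the two modes is the reference set used for ranking: a single averaged sequence recomputed from PL at every step, or every individual member of PL. On this toy data both rules pick the two bump-shaped series before the flat one; the sensitivity to boundary examples discussed in the abstract would only show up on less cleanly separated data.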
