4.7 Article

They are Not Completely Useless: Towards Recycling Transferable Unlabeled Data for Class-Mismatched Semi-Supervised Learning

Journal

IEEE TRANSACTIONS ON MULTIMEDIA
Volume 25, Pages 1844-1857

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TMM.2022.3179895

Keywords

Semi-supervised learning; class mismatch; domain adaptation


Semi-Supervised Learning with mismatched classes refers to the problem where the classes-of-interest in labeled data are only a subset of the classes in unlabeled data. To address this, recent methods divide unlabeled data into useful in-distribution (ID) data and harmful out-of-distribution (OOD) data. However, they overlook the potential value of OOD data. Thus, this paper proposes a Transferable OOD data Recycling (TOOR) method to properly utilize both ID data and recyclable OOD data for improved SSL performance.
Semi-Supervised Learning (SSL) with mismatched classes deals with the problem that the classes of interest in the limited labeled data are only a subset of the classes in the massive unlabeled data. As a result, classical SSL methods are misled by the classes that appear only in the unlabeled data. To solve this problem, some recent methods divide unlabeled data into useful in-distribution (ID) data and harmful out-of-distribution (OOD) data, and weaken the influence of the latter. In doing so, they largely overlook the potential value contained in OOD data. To remedy this defect, this paper proposes a Transferable OOD data Recycling (TOOR) method, which properly utilizes ID data as well as the recyclable OOD data to enrich the information for conducting class-mismatched SSL. Specifically, TOOR treats the OOD data that have a close relationship with ID data and labeled data as recyclable, and employs adversarial domain adaptation to project them into the space of ID data and labeled data. In other words, the recyclability of an OOD datum is evaluated by its transferability, and the recyclable OOD data are transferred so that they are compatible with the distribution of the known classes of interest. Consequently, TOOR extracts more information from unlabeled data than existing methods, and experiments on typical benchmark datasets demonstrate the resulting performance improvement.
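
The three-way treatment of unlabeled data described in the abstract (ID data, recyclable OOD data, and remaining OOD data) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the scores are computed or thresholded, so the function name `partition_unlabeled`, the two per-sample scores, and both threshold values are hypothetical. The sketch assumes each unlabeled sample already has an ID score (for example, a classifier confidence) and a transferability score (for example, the output of a domain discriminator).

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class Partition:
    """Indices of unlabeled samples, grouped by how they would be used."""
    id_idx: List[int] = field(default_factory=list)          # treated as in-distribution
    recyclable_idx: List[int] = field(default_factory=list)  # OOD, but transferable
    discarded_idx: List[int] = field(default_factory=list)   # OOD, not transferable


def partition_unlabeled(
    id_scores: Sequence[float],
    transfer_scores: Sequence[float],
    id_threshold: float = 0.9,        # hypothetical value, not from the paper
    transfer_threshold: float = 0.5,  # hypothetical value, not from the paper
) -> Partition:
    """Split unlabeled samples into ID, recyclable OOD, and discarded OOD.

    Assumption: higher id_scores mean "more likely in-distribution" and
    higher transfer_scores mean "more transferable to the ID/labeled space".
    """
    if len(id_scores) != len(transfer_scores):
        raise ValueError("score sequences must have the same length")

    part = Partition()
    for i, (s, t) in enumerate(zip(id_scores, transfer_scores)):
        if s >= id_threshold:
            part.id_idx.append(i)
        elif t >= transfer_threshold:
            # OOD sample judged close enough to the known classes to recycle.
            part.recyclable_idx.append(i)
        else:
            part.discarded_idx.append(i)
    return part
```

In a full pipeline, the samples in `recyclable_idx` would be the ones passed to the adversarial domain adaptation step; that step is omitted here because the abstract gives no details about it.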
