Article

DML-PL: Deep metric learning based pseudo-labeling framework for class imbalanced semi-supervised learning

Journal

INFORMATION SCIENCES
Volume 626, Pages 641-657

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2023.01.074

Keywords

Class imbalanced classification; Semi-supervised learning; Deep metric learning

This study proposes a deep metric learning-based pseudo-labeling (DML-PL) framework that addresses both class imbalance and insufficient labeled data problems. Through an iterative self-training strategy, a deep metric network is trained to learn compact feature representations of labeled and unlabeled data, generating reliable pseudo-labels and improving training accuracy.
Traditional class-imbalanced learning algorithms require the training data to be labeled, whereas semi-supervised learning algorithms assume that the class distribution is balanced. However, class imbalance and insufficient labeled data often coexist in real-world applications. Most existing class-imbalanced semi-supervised learning methods tackle these two problems separately, which leaves the trained model biased towards the majority classes that have more data samples. In this study, we propose a deep metric learning based pseudo-labeling (DML-PL) framework that tackles both problems simultaneously for class-imbalanced semi-supervised learning. The proposed DML-PL framework comprises three modules: Deep Metric Learning, Pseudo-Labeling and Network Fine-tuning. An iterative self-training strategy trains the model over multiple rounds. In each round, Deep Metric Learning trains a deep metric network to learn compact feature representations of labeled and unlabeled data. Pseudo-Labeling then generates reliable pseudo-labels for unlabeled data through labeled data clustering with nearest neighbors selection. Finally, Network Fine-tuning fine-tunes the deep metric network so that it generates better pseudo-labels in the subsequent round. Training ends when all the unlabeled data are pseudo-labeled. The proposed framework achieved state-of-the-art performance on the long-tailed CIFAR-10, CIFAR-100, and ImageNet127 benchmark datasets compared with baseline models.
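
The loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the network, the clustering procedure, or the selection rule, so everything below is an assumption made for illustration. Feature vectors stand in for the embeddings a deep metric network would produce, each class is summarized by the centroid of its labeled points, and at most `k` nearest unlabeled points per class are pseudo-labeled in each round. The names `pseudo_label_round`, `self_train`, and `k` are hypothetical.

```python
import math
from collections import defaultdict


def centroid(points):
    """Mean vector of a non-empty list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def pseudo_label_round(labeled, unlabeled, k):
    """One pseudo-labeling round (illustrative assumption, not the paper's rule).

    labeled:   list of (vector, label) pairs
    unlabeled: list of vectors
    Returns (newly_labeled, remaining_unlabeled).
    """
    by_class = defaultdict(list)
    for vec, lab in labeled:
        by_class[lab].append(vec)
    centers = {lab: centroid(vecs) for lab, vecs in by_class.items()}

    # Assign every unlabeled point to its nearest class centroid.
    candidates = defaultdict(list)
    for idx, vec in enumerate(unlabeled):
        lab, dist = min(
            ((l, math.dist(vec, c)) for l, c in centers.items()),
            key=lambda pair: pair[1],
        )
        candidates[lab].append((dist, idx))

    # Keep at most k closest points per class, so a majority class
    # cannot claim more pseudo-labels per round than a minority class.
    chosen = {}
    for lab, items in candidates.items():
        for _, idx in sorted(items)[:k]:
            chosen[idx] = lab

    newly = [(unlabeled[i], lab) for i, lab in sorted(chosen.items())]
    remaining = [v for i, v in enumerate(unlabeled) if i not in chosen]
    return newly, remaining


def self_train(labeled, unlabeled, k=1):
    """Repeat rounds until every unlabeled point has a pseudo-label."""
    labeled = list(labeled)
    remaining = list(unlabeled)
    rounds = 0
    while remaining:
        # Placeholder: training and fine-tuning the metric network
        # would happen here; this sketch keeps the features fixed.
        newly, remaining = pseudo_label_round(labeled, remaining, k)
        labeled.extend(newly)
        rounds += 1
    return labeled, rounds
```

Capping the selection at `k` points per class is one simple way to keep a round from being dominated by the majority class; whether DML-PL uses a per-class cap is not stated in the abstract.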
