Journal
IEEE ROBOTICS AND AUTOMATION LETTERS
Volume 8, Issue 8, Pages 5084-5091
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/LRA.2023.3290420
Keywords
Object detection; Adaptation models; deep learning for visual perception; visual learning
Abstract
Unsupervised Domain Adaptive Object Detection (UDA-OD) uses unlabelled data to improve the reliability of robotic vision systems in open-world environments. Previous approaches to UDA-OD based on self-training have been effective in overcoming changes in the general appearance of images. However, shifts in a robot's deployment environment can also impact the likelihood that different objects will occur, termed class distribution shift. Motivated by this, we propose a framework for explicitly addressing class distribution shift to improve pseudo-label reliability in self-training. Our approach uses the domain invariance and contextual understanding of a pre-trained joint vision and language model to predict the class distribution of unlabelled data. By aligning the class distribution of pseudo-labels with this prediction, we provide weak supervision of pseudo-label accuracy. To further account for low quality pseudo-labels early in self-training, we propose an approach to dynamically adjust the number of pseudo-labels per image based on model confidence. Our method outperforms state-of-the-art approaches on several benchmarks, including a 4.7 mAP improvement when facing challenging class distribution shift.
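The abstract describes aligning pseudo-labels with a predicted class distribution while scaling the number of pseudo-labels per image with model confidence. As a rough illustration only (not the paper's actual implementation), the selection step could look like the following sketch, where `class_dist`, `mean_conf`, and `base_per_image` are hypothetical inputs standing in for the predicted class distribution, the detector's current confidence, and a base label budget:

```python
import numpy as np

def select_pseudo_labels(scores, labels, class_dist, mean_conf, base_per_image=10):
    """Hypothetical sketch: keep the highest-scoring detections per class,
    with per-class budgets proportional to a predicted class distribution
    and a total budget that grows with current model confidence."""
    # Total pseudo-label budget scales with how confident the model is,
    # so fewer (noisier) labels are kept early in self-training.
    total = int(round(base_per_image * mean_conf))
    keep = []
    for c, frac in enumerate(class_dist):
        # Per-class budget proportional to the predicted class distribution.
        budget = int(round(total * frac))
        # Detections of class c, best-scoring first.
        idx = [i for i in np.argsort(-scores) if labels[i] == c]
        keep.extend(idx[:budget])
    return sorted(keep)
```

For example, with four detections split over two equally likely classes, a low mean confidence keeps only the top detection per class, while a higher confidence keeps more.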