4.7 Article

From Less to More: Progressive Generalized Zero-Shot Detection With Curriculum Learning

Journal

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS
Volume 23, Issue 10, Pages 19016-19029

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TITS.2022.3151073

Keywords

Task analysis; Visualization; Generators; Training; Object detection; Semantics; Proposals; generalized zero-shot detection (GZSD); curriculum learning; generative adversarial network (GAN)

Funding

  1. National Natural Science Foundation of China [61872187, 62072246]
  2. Natural Science Foundation of Jiangsu Province [BK20201306]

Object detection is one of the most important environment perception tasks in intelligent transportation systems. Most existing research focuses on the fully supervised scenario, so models can fail on objects that were never seen during training. Zero-shot learning models can detect such unseen objects, but most Generalized Zero-Shot Detection (GZSD) approaches rely on visual-semantic mapping, even though generative models generally perform better. To fill this gap, we propose using curriculum learning to generate more precise unseen visual features. Our method outperforms state-of-the-art methods on MSCOCO and KITTI.
Object detection, one of the most important environment perception tasks for traffic safety in intelligent transportation systems, has been widely investigated recently. However, most research focuses on the fully supervised scenario, which inevitably leads to model failure. With the continuous development of Zero-Shot Learning (ZSL) models, Generalized Zero-Shot Detection (GZSD) has attracted great attention for its ability to detect unseen objects. Many researchers map the detected visual features to semantic attributes and then separate the seen and unseen domains during inference. This overlooks the fact that generative methods generally perform better than such visual-semantic mapping methods, as previous generalized zero-shot learning (GZSL) work has confirmed. To fill the gap left by the lack of generative methods in GZSD, we propose using curriculum learning to generate more precise unseen visual features. Building on the strong performance of WGAN-based methods in sample synthesis, we use semantics to generate visual features for unseen domains. In addition, we adopt part of the idea of meta-learning to progressively correct the generator, which better mitigates the domain shift problem during generation. Combining these ideas with the detection ability of Faster-RCNN, we can detect both seen and unseen bounding boxes and classify them accurately. Extensive experimental results on two popular datasets, MSCOCO and KITTI, show that our proposed method outperforms state-of-the-art methods.
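The abstract does not give the generator or its training objective in detail, so the following is only a minimal sketch of the general idea of a semantics-conditioned WGAN, not the authors' implementation. It assumes the generator input is a class semantic vector concatenated with Gaussian noise and uses the standard Wasserstein critic and generator losses; the function names, dimensions, and scores are hypothetical and chosen for illustration.

```python
import random


def generator_input(semantic_vec, noise_dim, rng):
    """Build a conditional generator input: class semantics followed by noise.

    Assumption: conditioning is done by simple concatenation.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return list(semantic_vec) + noise


def mean(xs):
    return sum(xs) / len(xs)


def critic_loss(real_scores, fake_scores):
    """Standard WGAN critic loss.

    Minimizing it pushes critic scores up on real visual features
    and down on generated ones.
    """
    return mean(fake_scores) - mean(real_scores)


def generator_loss(fake_scores):
    """Standard WGAN generator loss.

    Minimizing it raises the critic's score on generated features.
    """
    return -mean(fake_scores)


# Hypothetical 3-dimensional semantic vector for one unseen class.
rng = random.Random(0)
z = generator_input([0.2, 0.8, 0.0], noise_dim=4, rng=rng)

# Hypothetical critic scores for a batch of real and generated features.
real_scores = [1.0, 3.0]
fake_scores = [0.0, 1.0]
loss_critic = critic_loss(real_scores, fake_scores)
loss_generator = generator_loss(fake_scores)
```

In a full pipeline along the lines the abstract describes, features generated this way for unseen classes would be used to train a classifier applied to the region proposals of the detector. The curriculum schedule and the meta-learning correction of the generator are not specified in this text and are left out of the sketch.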