☆ 3.8 Proceedings Paper

RegionCLIP: Region-based Language-Image Pretraining

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) (2022)

Learning to Generate Scene Graph from Natural Language Supervision

Yiwu Zhong et al.

Summary: This paper introduces a method that learns from image-sentence pairs to extract a scene graph, a graphical representation of the localized objects in an image and their relationships. Using an off-the-shelf object detector and a Transformer-based model designed to predict pseudo labels, the method achieves strong results on both weakly and fully supervised scene graph generation. Experimental results show a 30% relative gain over the latest method trained with human-annotated unlocalized scene graphs.

2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021) (2021)

Proceedings Paper Computer Science, Artificial Intelligence

PreDet: Large-scale weakly supervised pre-training for detection

Vignesh Ramanathan et al.

Summary: This study introduces a new large-scale pre-training strategy for object detection that augments standard classification pre-training with noisy class labels and a detection-specific pretext task. By redesigning Faster R-CNN modules to perform this task efficiently, the authors show significant improvements over existing weakly supervised and self-supervised pre-training approaches in both detection accuracy and fine-tuning speed.

2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021) (2021)

Article Computer Science, Artificial Intelligence