4.7 Article

How Does Fine-Tuning Impact Out-of-Distribution Detection for Vision-Language Models?

Related References

Note: only a partial list of references is shown here; refer to the original article for the complete reference information.
Article Computer Science, Artificial Intelligence

Learning to Prompt for Vision-Language Models

Kaiyang Zhou et al.

Summary: Large pre-trained vision-language models such as CLIP learn transferable representations, but adapting them to downstream tasks depends heavily on hand-crafted prompt engineering. This study introduces Context Optimization (CoOp), which models a prompt's context words as learnable vectors while keeping the pre-trained encoders frozen; with only one or two shots it surpasses hand-crafted prompts, and with more shots it yields significant further gains (see the sketch after this entry).

INTERNATIONAL JOURNAL OF COMPUTER VISION (2022)
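To make the idea of learned context vectors concrete, here is a minimal PyTorch sketch of a CoOp-style prompt learner. The class name `CoOpPromptLearner`, its constructor parameters, and the initialization scale are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class CoOpPromptLearner(nn.Module):
    """Minimal sketch of CoOp-style context optimization (hypothetical simplification).

    A shared sequence of learnable context vectors replaces the hand-crafted
    prompt ("a photo of a ..."); only these vectors are trained, while the
    CLIP text and image encoders stay frozen.
    """

    def __init__(self, n_ctx: int, ctx_dim: int, class_token_embeddings: torch.Tensor):
        super().__init__()
        # Learnable context vectors, shared across all classes.
        self.ctx = nn.Parameter(torch.randn(n_ctx, ctx_dim) * 0.02)
        # Frozen token embeddings of the class names, shape (n_classes, n_cls_tokens, ctx_dim).
        self.register_buffer("cls_emb", class_token_embeddings)

    def forward(self) -> torch.Tensor:
        n_classes = self.cls_emb.shape[0]
        # Prepend the same context vectors to every class-name embedding:
        # result shape (n_classes, n_ctx + n_cls_tokens, ctx_dim).
        ctx = self.ctx.unsqueeze(0).expand(n_classes, -1, -1)
        return torch.cat([ctx, self.cls_emb], dim=1)

# Training (schematic): feed the assembled prompts through the frozen CLIP text
# encoder, compute image-text cosine logits, and back-propagate the few-shot
# cross-entropy loss only into `self.ctx`.
```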

Proceedings Paper Computer Science, Artificial Intelligence

Tip-Adapter: Training-Free Adaption of CLIP for Few-Shot Classification

Renrui Zhang et al.

Summary: Contrastive Vision-Language Pre-training (CLIP) learns visual representations from large-scale image-text pairs and performs impressively on downstream tasks. To improve CLIP's adaptability, the authors propose Tip-Adapter, a training-free adaptation method that builds an adapter from a key-value cache of the few-shot training features and refines CLIP's prior knowledge through feature retrieval (see the sketch after this entry).

COMPUTER VISION - ECCV 2022, PT XXXV (2022)
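The following minimal sketch illustrates the training-free cache idea behind Tip-Adapter: few-shot features act as keys, their one-hot labels as values, and retrieved cache logits are blended with CLIP's zero-shot logits. The function name `tip_adapter_logits` and the default values of `alpha` and `beta` are assumptions made for illustration.

```python
import torch

def tip_adapter_logits(test_feat, cache_keys, cache_values, clip_weights,
                       alpha=1.0, beta=5.5):
    """Minimal sketch of a Tip-Adapter-style training-free cache model.

    test_feat:    (N, D) L2-normalized CLIP image features of test images
    cache_keys:   (K, D) L2-normalized features of the few-shot training images
    cache_values: (K, C) one-hot labels of the few-shot training images
    clip_weights: (D, C) L2-normalized text embeddings of the class prompts
    """
    # Zero-shot CLIP logits from image-text similarity.
    clip_logits = 100.0 * test_feat @ clip_weights
    # Affinity between test features and the cached few-shot keys.
    affinity = test_feat @ cache_keys.t()
    # Sharpened, exponentiated affinities retrieve the cached label knowledge.
    cache_logits = torch.exp(-beta * (1.0 - affinity)) @ cache_values
    # Blend the retrieved few-shot knowledge with the zero-shot prediction.
    return clip_logits + alpha * cache_logits
```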

Proceedings Paper Computer Science, Artificial Intelligence

Conditional Prompt Learning for Vision-Language Models

Kaiyang Zhou et al.

Summary: This study investigates how to adapt pre-trained vision-language models to downstream datasets. Context Optimization (CoOp) introduced prompt learning for this purpose but tends to overfit to the classes seen during training. To address this, Conditional Context Optimization (CoCoOp) generates dynamic, instance-conditioned prompts with a lightweight neural network and achieves better generalization to unseen classes (see the sketch after this entry).

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) (2022)
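A minimal sketch of how an instance-conditioned prompt could be built: a small meta-network maps each image feature to a bias that shifts the shared context vectors. The class name `CoCoOpPromptLearner`, the bottleneck ratio of the meta-network, and the tensor layout are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CoCoOpPromptLearner(nn.Module):
    """Minimal sketch of CoCoOp-style conditional prompts (hypothetical simplification)."""

    def __init__(self, n_ctx: int, ctx_dim: int, img_dim: int,
                 class_token_embeddings: torch.Tensor):
        super().__init__()
        # Shared learnable context vectors, as in CoOp.
        self.ctx = nn.Parameter(torch.randn(n_ctx, ctx_dim) * 0.02)
        # Frozen class-name token embeddings, shape (n_classes, n_cls_tokens, ctx_dim).
        self.register_buffer("cls_emb", class_token_embeddings)
        # Lightweight bottleneck MLP that conditions the prompt on the image.
        self.meta_net = nn.Sequential(
            nn.Linear(img_dim, img_dim // 16),
            nn.ReLU(inplace=True),
            nn.Linear(img_dim // 16, ctx_dim),
        )

    def forward(self, image_features: torch.Tensor) -> torch.Tensor:
        # image_features: (B, img_dim) -> per-image shift of shape (B, 1, ctx_dim).
        bias = self.meta_net(image_features).unsqueeze(1)
        # Instance-conditioned context vectors: (B, n_ctx, ctx_dim).
        ctx = self.ctx.unsqueeze(0) + bias
        n_classes = self.cls_emb.shape[0]
        # Per-image, per-class prompts: (B, n_classes, n_ctx + n_cls_tokens, ctx_dim).
        ctx = ctx.unsqueeze(1).expand(-1, n_classes, -1, -1)
        cls = self.cls_emb.unsqueeze(0).expand(ctx.shape[0], -1, -1, -1)
        return torch.cat([ctx, cls], dim=2)
```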

Proceedings Paper Computer Science, Artificial Intelligence

ViM: Out-Of-Distribution with Virtual-logit Matching

Haoqi Wang et al.

Summary: Most existing OOD detection algorithms rely on a single input source (features, logits, or softmax probabilities), which makes them fragile given the diversity of OOD examples. This paper proposes a novel OOD scoring method, Virtual-logit Matching (ViM), that combines a class-agnostic score from the feature space with the class-dependent logits of the in-distribution classes. In addition, a new OOD benchmark dataset for ImageNet-1K is released to support large-scale OOD evaluation in academia (see the sketch after this entry).

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) (2022)
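A minimal NumPy sketch of a virtual-logit-matching-style score: the norm of the feature component outside the principal subspace of the training data is rescaled into a "virtual logit", appended to the real logits, and softmax-normalized. The function name `vim_score`, the `residual_basis` argument, and the way `alpha` is supplied are assumptions for illustration, and the feature-centering step of the original method is omitted.

```python
import numpy as np

def vim_score(features, logits, residual_basis, alpha):
    """Minimal sketch of a ViM-style OOD score (hypothetical simplification).

    features:       (N, D) penultimate-layer features of the test inputs
    logits:         (N, C) class logits for the same inputs
    residual_basis: (D, D-k) orthonormal basis of the residual space, i.e. the
                    complement of the principal subspace fitted on ID features
    alpha:          scaling constant matching the virtual logit to real logits
    """
    # Class-agnostic part: norm of the feature residual outside the principal subspace.
    residual = np.linalg.norm(features @ residual_basis, axis=-1)
    virtual_logit = alpha * residual
    # Class-dependent part: append the virtual logit and softmax-normalize.
    full = np.concatenate([logits, virtual_logit[:, None]], axis=1)
    full -= full.max(axis=1, keepdims=True)
    probs = np.exp(full) / np.exp(full).sum(axis=1, keepdims=True)
    # Higher probability mass on the virtual class indicates a likelier OOD input.
    return probs[:, -1]
```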

Article Computer Science, Artificial Intelligence

Places: A 10 Million Image Database for Scene Recognition

Bolei Zhou et al.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2018)

Proceedings Paper Computer Science, Artificial Intelligence

Describing Textures in the Wild

Mircea Cimpoi et al.

2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) (2014)