Proceedings Paper

PromptDet: Towards Open-Vocabulary Detection Using Uncurated Images

COMPUTER VISION, ECCV 2022, PT IX (2022)

Journal

COMPUTER VISION, ECCV 2022, PT IX

Volume 13669, Issue -, Pages 701-717

Publisher

SPRINGER INTERNATIONAL PUBLISHING AG

DOI: 10.1007/978-3-031-20077-9_41

Categories

Computer Science, Artificial Intelligence; Imaging Science & Photographic Technology

Summary

The goal of this work is to establish a scalable pipeline for expanding an object detector towards novel/unseen categories, using zero manual annotations. The proposed approach includes a two-stage open-vocabulary object detector, regional prompt learning to align visual and textual embeddings, and a self-training framework using online resources. The proposed detector, PromptDet, outperforms existing approaches with fewer training images and no manual annotations.

Abstract

The goal of this work is to establish a scalable pipeline for expanding an object detector towards novel/unseen categories, using zero manual annotations. To achieve that, we make the following four contributions: (i) in pursuit of generalisation, we propose a two-stage open-vocabulary object detector, where the class-agnostic object proposals are classified with a text encoder from a pre-trained visual-language model; (ii) to pair the visual latent space (of RPN box proposals) with that of the pre-trained text encoder, we propose the idea of regional prompt learning to align the textual embedding space with regional visual object features; (iii) to scale up the learning procedure towards detecting a wider spectrum of objects, we exploit the available online resources via a novel self-training framework, which allows training the proposed detector on a large corpus of noisy, uncurated web images; and (iv) to evaluate our proposed detector, termed PromptDet, we conduct extensive experiments on the challenging LVIS and MS-COCO datasets. PromptDet shows superior performance over existing approaches with fewer additional training images and zero manual annotations. Project page with code: https://fcjian.github.io/promptdet.
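The first contribution in the abstract, classifying class-agnostic proposals against embeddings from a text encoder, can be illustrated with a minimal sketch. This is not the PromptDet implementation: the function name `classify_proposals`, the cosine-similarity scoring, the `temperature` value, and the toy vectors are all assumptions made for illustration, and in practice the embeddings would come from the detector's region features and a pre-trained text encoder.

```python
import math


def _normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def classify_proposals(region_feats, text_embeds, temperature=0.01):
    """Label each proposal with the category whose text embedding is closest.

    region_feats: list of visual feature vectors, one per class-agnostic proposal.
    text_embeds:  list of category embeddings (assumed to come from a text encoder).
    temperature:  hypothetical softmax temperature, not a value from the paper.
    """
    text_unit = [_normalize(t) for t in text_embeds]
    labels, probs = [], []
    for feat in region_feats:
        unit = _normalize(feat)
        # Cosine similarity between the proposal and every category embedding.
        logits = [sum(a * b for a, b in zip(unit, t)) / temperature
                  for t in text_unit]
        # Numerically stable softmax over categories.
        top = max(logits)
        exps = [math.exp(v - top) for v in logits]
        total = sum(exps)
        row = [e / total for e in exps]
        probs.append(row)
        labels.append(row.index(max(row)))
    return labels, probs


# Toy example: three categories with orthogonal embeddings, two proposals.
toy_text = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
toy_regions = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.8]]
toy_labels, toy_probs = classify_proposals(toy_regions, toy_text)
```

Because the category set enters only through the rows of `text_embeds`, adding a novel category in this sketch means appending one more embedding rather than retraining a fixed classifier head, which is the property an open-vocabulary detector relies on.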
