Journal
2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022)
Volume -, Issue -, Pages 439-448
Publisher
IEEE COMPUTER SOC
DOI: 10.1109/WACV51458.2022.00051
Keywords
-
We focus on representation learning for large-scale instance-level image retrieval. We propose a global-local attention module (GLAM) that incorporates all four forms of attention: local and global, spatial and channel. By combining these attention mechanisms, we obtain a powerful image representation and achieve state-of-the-art performance on standard benchmarks.
We address representation learning for large-scale instance-level image retrieval. Apart from the backbone, training pipelines and loss functions, popular approaches have focused on different spatial pooling and attention mechanisms, which are at the core of learning a powerful global image representation. There are different forms of attention according to the interaction of elements of the feature tensor (local and global) and the dimensions where it is applied (spatial and channel). Unfortunately, each study addresses only one or two forms of attention and applies them to different problems such as classification, detection or retrieval. We present the global-local attention module (GLAM), which is attached at the end of a backbone network and incorporates all four forms of attention: local and global, spatial and channel. We obtain a new feature tensor and, by spatial pooling, we learn a powerful embedding for image retrieval. Focusing on global descriptors, we provide empirical evidence of the interaction of all forms of attention and improve the state of the art on standard benchmarks.
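The four forms of attention named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the fixed averaging kernels standing in for learned convolutions, the dot-product form of the global attention, the equal-weight fusion, and the GeM-style pooling are all assumptions made for illustration only. It operates on a single feature tensor of shape (C, H, W), as a backbone might produce.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def local_channel_attention(F):
    """Gate per channel: pool over space, mix neighbouring channels."""
    pooled = F.mean(axis=(1, 2))                       # (C,)
    kernel = np.ones(3) / 3.0                          # assumption: stand-in for a learned 1D conv
    mixed = np.convolve(pooled, kernel, mode="same")   # (C,) when C >= 3
    return sigmoid(mixed)[:, None, None]               # (C, 1, 1)

def local_spatial_attention(F):
    """Gate per location: pool over channels, mix a 3x3 neighbourhood."""
    pooled = F.mean(axis=0)                            # (H, W)
    H, W = pooled.shape
    padded = np.pad(pooled, 1, mode="edge")
    # assumption: a 3x3 box filter stands in for a learned 2D conv
    mixed = sum(padded[i:i + H, j:j + W] for i in range(3) for j in range(3)) / 9.0
    return sigmoid(mixed)[None, :, :]                  # (1, H, W)

def global_channel_attention(F):
    """Every channel attends to every other channel."""
    C, H, W = F.shape
    X = F.reshape(C, H * W)
    A = softmax(X @ X.T / np.sqrt(H * W), axis=-1)     # (C, C)
    return (A @ X).reshape(C, H, W)

def global_spatial_attention(F):
    """Every location attends to every other location."""
    C, H, W = F.shape
    X = F.reshape(C, H * W)
    A = softmax(X.T @ X / np.sqrt(C), axis=-1)         # (HW, HW)
    return (X @ A.T).reshape(C, H, W)

def attention_module(F):
    """Combine local and global streams with the input (equal weights assumed)."""
    local = F * local_channel_attention(F) * local_spatial_attention(F)
    glob = global_spatial_attention(global_channel_attention(F))
    return (F + local + glob) / 3.0

def embed(F, p=3.0, eps=1e-6):
    """Spatially pool the attended tensor into an L2-normalised global descriptor."""
    G = attention_module(F)
    v = (np.clip(G, eps, None) ** p).mean(axis=(1, 2)) ** (1.0 / p)  # GeM-style pooling (assumed)
    return v / np.linalg.norm(v)

rng = np.random.default_rng(0)
features = rng.random((8, 4, 5))     # hypothetical backbone output: C=8, H=4, W=5
descriptor = embed(features)         # one global descriptor of length C
```

The local branches produce multiplicative gates from small neighbourhoods, while the global branches let every channel or location interact with all others; the attended tensor keeps the input shape, so any spatial pooling can follow it.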