Article

Polarized self-attention: Towards high-quality pixel-wise mapping

Journal

NEUROCOMPUTING
Volume 506, Pages 158-167

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2022.07.054

Keywords

Pixel-wise mapping; Self-attention; Polarization; Convolution


This study addresses the pixel-wise mapping problem in fine-grained computer vision tasks and proposes a Polarized Self-Attention (PSA) block for high-quality pixel-wise mapping. The PSA block models long-range dependencies and fits highly nonlinear pixel-wise outputs at low computational overhead. Experimental results show that PSA improves standard baselines by 2-4 points and state-of-the-art methods by 1-2 points on 2D pose estimation and semantic segmentation benchmarks.
We address the pixel-wise mapping problem that commonly arises in fine-grained computer vision tasks, such as estimating keypoint heatmaps and segmentation masks. These tasks require, at low computational overhead, modeling the long-range dependencies among high-resolution inputs and estimating highly nonlinear pixel-wise outputs. While attention mechanisms added to Deep Convolutional Neural Networks (DCNNs) can boost long-range dependencies, element-specific attention, such as the Nonlocal block, is highly complex and noise-sensitive to learn, and most simplified attention blocks are designed for image-wise classification and are simply applied to pixel-wise tasks without adaptation. In this paper, we present the Polarized Self-Attention (PSA) block, which targets high-quality pixel-wise mapping with: (1) Polarized filtering: keeping high internal resolution in both channel and spatial attention computation while completely collapsing input tensors along their counterpart dimensions. (2) Enhancement: composing a non-linearity that directly fits the output distribution of typical pixel-wise mappings, such as the 2D Gaussian distribution (keypoint heatmaps) or the 2D Binomial distribution (binary segmentation masks). Experimental results show that PSA boosts standard baselines by 2-4 points and state-of-the-art methods by 1-2 points on 2D pose estimation and semantic segmentation benchmarks. Code is available at https://github.com/DeLightCMU/PSA. (c) 2022 Published by Elsevier B.V.
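The two design points in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see the repository linked above for that), and several details are assumptions made for illustration: the projections are random matrices standing in for learned 1x1 convolutions, the internal width is C/2, the composed non-linearity is a softmax followed by a sigmoid, any normalization layers are omitted, and the two branches are summed in parallel. All function and parameter names (`channel_attention`, `spatial_attention`, `psa_parallel`, `init_params`) are hypothetical.

```python
import numpy as np


def softmax(z, axis):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def channel_attention(x, w_q, w_v, w_up):
    """Channel branch: the spatial axis is fully collapsed, while the
    channel axis keeps an internal width of C/2 (assumed)."""
    c, h, w = x.shape
    flat = x.reshape(c, h * w)          # (C, HW)
    q = softmax(w_q @ flat, axis=1)     # (1, HW): weights over positions
    v = w_v @ flat                      # (C/2, HW)
    pooled = v @ q.T                    # (C/2, 1): spatial axis collapsed
    weights = sigmoid(w_up @ pooled)    # (C, 1): softmax then sigmoid
    return weights.reshape(c, 1, 1)


def spatial_attention(x, w_q, w_v):
    """Spatial branch: the channel axis is fully collapsed, while the
    spatial axis keeps its full H x W resolution."""
    c, h, w = x.shape
    flat = x.reshape(c, h * w)                         # (C, HW)
    q = (w_q @ flat).mean(axis=1, keepdims=True)       # (C/2, 1): global pool
    q = softmax(q, axis=0)                             # weights over channels
    v = w_v @ flat                                     # (C/2, HW)
    weights = sigmoid(q.T @ v)                         # (1, HW)
    return weights.reshape(1, h, w)


def init_params(c, rng, scale=0.1):
    # Random stand-ins for learned 1x1 convolution weights (assumption).
    half = c // 2
    return {
        "channel": (
            scale * rng.standard_normal((1, c)),      # w_q
            scale * rng.standard_normal((half, c)),   # w_v
            scale * rng.standard_normal((c, half)),   # w_up
        ),
        "spatial": (
            scale * rng.standard_normal((half, c)),   # w_q
            scale * rng.standard_normal((half, c)),   # w_v
        ),
    }


def psa_parallel(x, params):
    # Parallel layout (assumed): reweight x by each branch and sum.
    ch = channel_attention(x, *params["channel"])
    sp = spatial_attention(x, *params["spatial"])
    return ch * x + sp * x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((8, 16, 16))   # (C, H, W) feature map
    out = psa_parallel(x, init_params(8, rng))
    print(out.shape)
```

In this sketch each branch reduces only the dimension it is not attending over, so the channel weights retain per-channel detail and the spatial weights retain per-pixel detail, which is one reading of "polarized filtering" in the abstract. The output has the same shape as the input, so a block of this kind could in principle be placed after an existing convolutional stage.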
