Article

Sparse Attention Module for optimizing semantic segmentation performance combined with a multi-task feature extraction network

Journal

VISUAL COMPUTER
Volume 38, Issue 7, Pages 2473-2488

Publisher

SPRINGER
DOI: 10.1007/s00371-021-02124-3

Keywords

Semantic segmentation; Sparse Attention Module; Class attention features; Multi-task

Funding

  1. Fundamental Research Funds for the Central Universities [JUSRP41908]
  2. National Natural Science Foundation of China [61201429, 61362030]
  3. China Postdoctoral Science Foundation [2015M581720, 2016M600360]
  4. Jiangsu Postdoctoral Science Foundation [1601216C]

Abstract

The paper proposes a Sparse Attention Model combined with a powerful multi-task feature extraction network to reduce computing resource consumption in semantic segmentation. By using a Class Attention Module, the model ensures that query vectors capture dense contextual information efficiently.
In semantic segmentation, researchers often use self-attention modules to capture long-range contextual information. These methods are effective, but self-attention incurs a cost that cannot be ignored: its enormous consumption of computing resources. How to reduce the resource consumption of the self-attention module while preserving performance is therefore a meaningful research topic. In this paper, we propose a Sparse Attention Model combined with a powerful multi-task feature extraction network for semantic segmentation. Unlike the classic self-attention model, our Sparse Attention Model does not compute inner products between all pairs of vectors. Instead, we first sparsify the Query and Key feature blocks defined in the self-attention module using a credit matrix generated from the pre-output, and then perform similarity modeling on the two sparse feature blocks. Meanwhile, to ensure that the vectors in Query capture dense contextual information, we design a Class Attention Module and embed it into the Sparse Attention Module. Compared with the Dual Attention Network for scene segmentation, our attention module greatly reduces the consumption of computing resources while maintaining accuracy.
Furthermore, in the feature extraction stage, downsampling causes serious loss of detailed information and degrades the segmentation performance of the network, so we adopt a multi-task feature extraction network. It learns semantic features and edge features in parallel, and the learned edge features are fed into the deep layers of the network to help restore detailed information and capture high-quality semantic features. Rather than using pure concatenation, we extract the edge features related to each channel by element-wise multiplication before concatenation.
Finally, we conduct experiments on three datasets: Cityscapes, PASCAL VOC2012 and ADE20K, and obtain competitive results.
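The abstract's core idea, attending only between sparsified Query and Key blocks selected by a per-position credit score from a preliminary prediction, can be sketched as follows. This is a minimal toy illustration, not the authors' implementation: the selection rule (low-credit positions as queries, high-credit positions as keys), the `keep_ratio` parameter, and the function name are all assumptions made for the example; the Class Attention Module is omitted.

```python
import numpy as np

def sparse_attention(query, key, value, credit, keep_ratio=0.25):
    """Toy sketch of credit-guided sparse attention.

    query, key, value: (N, C) feature vectors for N spatial positions.
    credit: (N,) per-position confidence from a preliminary ("pre-output")
        prediction. Here we let only the least-confident positions act as
        queries and only the most-confident positions act as keys -- one
        plausible reading of the credit-matrix sparsification (assumed).
    """
    n, c = query.shape
    k = max(1, int(n * keep_ratio))
    order = np.argsort(credit)
    q_idx = order[:k]    # low-credit positions: need extra context
    k_idx = order[-k:]   # high-credit positions: reliable keys

    # Similarity modeling only between the two sparse blocks: O(k^2 * C)
    # instead of the O(N^2 * C) of full self-attention.
    scores = query[q_idx] @ key[k_idx].T / np.sqrt(c)
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)

    out = value.copy()            # unselected positions pass through
    out[q_idx] = w @ value[k_idx] # selected positions get aggregated context
    return out
```

With `keep_ratio` fixed, the attention cost grows linearly in the number of retained positions rather than quadratically in all positions, which is the resource saving the abstract claims relative to dense pairwise attention.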
