4.7 Article

Bilateral attention decoder: A lightweight decoder for real-time semantic segmentation

Journal

NEURAL NETWORKS
Volume 137, Issue -, Pages 188-199

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.neunet.2021.01.021

Keywords

Semantic segmentation; Real time; Deep learning; Attention mechanism

Funding

  1. National Natural Science Foundation of China [61773295]
  2. Key Research and Development Program of Hubei Province, China [2020BAB113]
  3. Natural Science Foundation of Hubei Province, China [2019CFA037]

This paper proposes a lightweight bilateral attention decoder for real-time semantic segmentation that improves spatial accuracy and efficiency through information refinement and fusion. The proposed method achieves better performance at a higher inference speed than other state-of-the-art real-time methods, and better segmentation performance than several state-of-the-art non-real-time methods.
The encoder-decoder structure has been introduced into semantic segmentation to improve the spatial accuracy of the network by fusing high- and low-level feature maps. However, recent state-of-the-art encoder-decoder-based methods can hardly meet real-time requirements because their decoders are complex and inefficient. To address this issue, we propose a lightweight bilateral attention decoder for real-time semantic segmentation. It consists of two blocks and fuses feature maps of different levels in two steps: information refinement and information fusion. In the first step, we propose a channel attention branch to refine the high-level feature maps and a spatial attention branch to refine the low-level ones. The refined high-level feature maps capture more exact semantic information and the refined low-level ones capture more accurate spatial information, which significantly improves the information-capturing ability of these feature maps. In the second step, we develop a new fusion module, named the pooling fusing block, to fuse the refined high- and low-level feature maps. This fusion block takes full advantage of the high- and low-level feature maps, leading to high-quality fusion results. To verify the efficiency of the proposed bilateral attention decoder, we adopt a lightweight network as the backbone and compare our method with other state-of-the-art real-time semantic segmentation methods on the Cityscapes and CamVid datasets. Experimental results demonstrate that our method achieves better performance at a higher inference speed. Moreover, we compare our network with several state-of-the-art non-real-time semantic segmentation methods and find that it also attains better segmentation performance. © 2021 Elsevier Ltd. All rights reserved.
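
The two-step flow described in the abstract (refine, then fuse) can be illustrated with a small dependency-free sketch. Everything in it is an assumption made for illustration: the abstract does not specify the internals of either attention branch or of the pooling fusing block, so the code uses generic weight-free gates (a sigmoid of the per-channel mean for channel attention, a sigmoid of the per-pixel mean for spatial attention) and a plain upsample-and-add as a stand-in for the fusion step. The function names are hypothetical and are not taken from the paper.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(feat):
    """Assumed gate: scale each channel by the sigmoid of its global average."""
    out = []
    for channel in feat:
        values = [v for row in channel for v in row]
        gate = sigmoid(sum(values) / len(values))
        out.append([[v * gate for v in row] for row in channel])
    return out

def spatial_attention(feat):
    """Assumed gate: scale each pixel by the sigmoid of its mean across channels."""
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    gates = [[sigmoid(sum(feat[k][i][j] for k in range(c)) / c)
              for j in range(w)] for i in range(h)]
    return [[[feat[k][i][j] * gates[i][j] for j in range(w)]
             for i in range(h)] for k in range(c)]

def upsample_nearest(feat, scale):
    """Nearest-neighbour upsampling by an integer factor."""
    out = []
    for channel in feat:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(scale)]
            rows.extend(list(wide) for _ in range(scale))
        out.append(rows)
    return out

def fuse(high, low):
    """Stand-in fusion (NOT the paper's pooling fusing block):
    upsample the high-level map to the low-level resolution and add.
    Assumes equal channel counts and an integer scale factor."""
    scale = len(low[0]) // len(high[0])
    up = upsample_nearest(high, scale)
    return [[[a + b for a, b in zip(row_up, row_low)]
             for row_up, row_low in zip(ch_up, ch_low)]
            for ch_up, ch_low in zip(up, low)]

def decode(high, low):
    """Step 1: refine each input with its own branch. Step 2: fuse."""
    return fuse(channel_attention(high), spatial_attention(low))

# Toy feature maps in channel x height x width layout.
high = [[[1.0]], [[-1.0]]]                        # 2 channels, 1x1 (coarse)
low = [[[0.0, 1.0], [2.0, 3.0]],
       [[0.0, -1.0], [-2.0, -3.0]]]               # 2 channels, 2x2 (fine)
fused = decode(high, low)
```

The result has the resolution of the low-level input, which is the point of a decoder of this kind: coarse semantic features are brought back to the finer spatial grid. A real implementation would use learned convolutions inside both branches and the fusion block.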


