4.7 Article

Bilateral attention decoder: A lightweight decoder for real-time semantic segmentation

Journal

NEURAL NETWORKS
Volume 137, Pages 188-199

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.neunet.2021.01.021

Keywords

Semantic segmentation; Real time; Deep learning; Attention mechanism

Funding

  1. National Natural Science Foundation of China [61773295]
  2. Key Research and Development Program of Hubei Province, China [2020BAB113]
  3. Natural Science Foundation of Hubei Province, China [2019CFA037]


This paper proposes a lightweight bilateral attention decoder for real-time semantic segmentation, which improves spatial accuracy and efficiency through information refinement and fusion. The proposed method achieves better performance at a higher inference speed than other state-of-the-art real-time methods, and better segmentation performance than several state-of-the-art non-real-time methods.
The encoder-decoder structure has been introduced into semantic segmentation to improve the spatial accuracy of the network by fusing high- and low-level feature maps. However, recent state-of-the-art encoder-decoder-based methods can hardly meet real-time requirements because of their complex and inefficient decoders. To address this issue, we propose a lightweight bilateral attention decoder for real-time semantic segmentation. It consists of two blocks and fuses feature maps from different levels in two steps: information refinement and information fusion. In the first step, we propose a channel attention branch to refine the high-level feature maps and a spatial attention branch to refine the low-level ones. The refined high-level feature maps capture more exact semantic information and the refined low-level ones capture more accurate spatial information, which significantly improves the information-capturing ability of these feature maps. In the second step, we develop a new fusion module, named the pooling fusing block, to fuse the refined high- and low-level feature maps. This fusion block takes full advantage of the high- and low-level feature maps, leading to high-quality fusion results. To verify the efficiency of the proposed bilateral attention decoder, we adopt a lightweight network as the backbone and compare our method with other state-of-the-art real-time semantic segmentation methods on the Cityscapes and CamVid datasets. Experimental results demonstrate that our method achieves better performance at a higher inference speed. Moreover, we compare our network with several state-of-the-art non-real-time semantic segmentation methods and find that it also attains better segmentation performance. © 2021 Elsevier Ltd. All rights reserved.
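
The two-step idea in the abstract (refine, then fuse) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the layers inside either attention branch or inside the pooling fusing block. The sketch assumes generic sigmoid-gated channel and spatial attention, assumes both feature maps have the same number of channels, and substitutes a plain upsample-and-add for the fusion step. All function names, weight shapes, and the `factor` parameter are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(high, weight):
    """Refine a high-level (C, H, W) map with a per-channel gate.

    Assumption: the gate is a sigmoid over a linear map (weight, shape
    (C, C)) of the globally average-pooled channel descriptor.
    """
    pooled = high.mean(axis=(1, 2))          # (C,)
    gate = sigmoid(weight @ pooled)          # (C,), values in (0, 1)
    return high * gate[:, None, None]


def spatial_attention(low, weight):
    """Refine a low-level (C, H, W) map with a per-pixel gate.

    Assumption: weight (shape (C,)) acts like a 1x1 convolution that
    collapses the channels into a single attention map.
    """
    gate = sigmoid(np.einsum("c,chw->hw", weight, low))  # (H, W)
    return low * gate[None, :, :]


def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) map by an integer factor."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)


def bilateral_decode(high, low, w_channel, w_spatial, factor):
    """Step 1: refine each input. Step 2: fuse.

    The fusion here is a placeholder (upsample, then add); it stands in
    for the paper's pooling fusing block, whose design is not given in
    the abstract.
    """
    refined_high = channel_attention(high, w_channel)
    refined_low = spatial_attention(low, w_spatial)
    return upsample_nearest(refined_high, factor) + refined_low


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    high = rng.normal(size=(4, 2, 2))   # coarse, semantically rich map
    low = rng.normal(size=(4, 4, 4))    # fine, spatially detailed map
    out = bilateral_decode(high, low, rng.normal(size=(4, 4)),
                           rng.normal(size=4), factor=2)
    print(out.shape)
```

Because both gates lie in (0, 1), each branch can only attenuate its input; a real decoder would learn the weights and would typically add convolutions and normalization around these steps.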

