Article

DSANet: Dilated spatial attention for real-time semantic segmentation in urban street scenes

Journal

EXPERT SYSTEMS WITH APPLICATIONS
Volume 183, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.eswa.2021.115090

Keywords

Street scene understanding; Deep learning; Lightweight convolutional neural network; Real-time semantic segmentation; Spatial attention


Efficient and accurate semantic segmentation is crucial to scene understanding for autonomous driving. This paper introduces a computationally efficient network named DSANet, which follows a two-branch strategy for real-time semantic segmentation in urban scenes. The proposed method achieves high segmentation accuracy while improving inference speed through a semantic encoding branch, dual attention modules, and a spatial encoding network.
Efficient and accurate semantic segmentation is particularly important in scene understanding for autonomous driving. Although deep convolutional neural network (DCNN) approaches have brought significant improvements to semantic segmentation, state-of-the-art models such as DeepLab and PSPNet have complex architectures and high computational complexity, making them inefficient for real-time applications. On the other hand, many works sacrifice performance to obtain real-time inference speed, which makes developing a lightweight network model with high segmentation accuracy critical. In this paper, we present a computationally efficient network named DSANet, which follows a two-branch strategy to tackle the problem of real-time semantic segmentation in urban scenes. We first design a Semantic Encoding Branch, which employs channel split and shuffle operations to reduce computation while maintaining high segmentation accuracy. We also propose a dual attention module, consisting of dilated spatial attention and channel attention, to make full use of the multi-level feature maps simultaneously, which helps predict pixel-wise labels at each stage. Meanwhile, a Spatial Encoding Network is used to enhance semantic information and preserve spatial details. To better combine context information and spatial information, we introduce a Simple Feature Fusion Module. We evaluated our model against state-of-the-art semantic segmentation methods on two challenging datasets. The proposed method achieves 69.9% mean IoU at 75.3 fps on the CamVid test dataset and 71.3% mean IoU at 34.08 fps on the Cityscapes test dataset.
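The channel split and shuffle operations mentioned in the abstract are standard lightweight-network primitives (popularized by ShuffleNet-style designs). The paper does not publish its exact implementation here, so the following is only a minimal NumPy sketch of the general idea under assumed NCHW tensor layout and a hypothetical 50/50 split ratio:

```python
import numpy as np

def channel_split(x, ratio=0.5):
    # Split a feature map (N, C, H, W) into two channel groups,
    # so each branch processes only a fraction of the channels.
    c = int(x.shape[1] * ratio)
    return x[:, :c], x[:, c:]

def channel_shuffle(x, groups):
    # Interleave channels across groups so information can mix
    # between branches after grouped/split processing.
    n, c, h, w = x.shape
    x = x.reshape(n, groups, c // groups, h, w)
    x = x.transpose(0, 2, 1, 3, 4)  # swap group and channel-per-group axes
    return x.reshape(n, c, h, w)

# Toy example: 8 channels labeled 0..7 on a 1x1 spatial grid.
x = np.arange(8).reshape(1, 8, 1, 1)
a, b = channel_split(x)            # a holds channels 0-3, b holds 4-7
y = channel_shuffle(x, groups=2)
print(y.ravel().tolist())          # [0, 4, 1, 5, 2, 6, 3, 7]
```

The shuffle is a pure reshape/transpose, so it adds essentially no compute while restoring cross-group information flow that splitting would otherwise block.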

