Article

Multi-directional feature refinement network for real-time semantic segmentation in urban street scenes

IET COMPUTER VISION (2023)

Journal

IET COMPUTER VISION

Volume 17, Issue 4, Pages 431-444

Publisher

WILEY

DOI: 10.1049/cvi2.12178

Keywords

computer vision; image segmentation

Categories

Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic

Abstract


Efficient and accurate semantic segmentation is crucial for autonomous driving scene parsing. Two-branch networks, which capture detailed information and semantic information efficiently, have been widely utilised in real-time semantic segmentation. This study proposes a network named MRFNet, based on a two-branch strategy, to address both the accuracy and the speed of segmentation in urban scenes. Many real-time networks do not comprehensively consider contextual information from sub-regions in different directions and at different scales. To handle this problem, a Multi-directional Feature Refinement Module (MFRM) is proposed, which has three sub-paths to capture information at different scales and directions. MFRM also reduces computation by using strip pooling and dilated convolution operations. In particular, the authors propose a Feature Cross-guide Aggregation Module that aggregates detailed information and contextual information through the mutual guidance of detailed information and semantic information. This module guides the extraction of feature maps in a more precise direction. Experiments on the Cityscapes and CamVid datasets demonstrate the effectiveness of the method, which achieves a balance between accuracy and inference speed. Specifically, on a single 1080Ti GPU, the method yields 78.9% mean intersection over union (mIoU) at 144.5 frames per second (FPS) on Cityscapes and 77.4% mIoU at 120.8 FPS on CamVid.
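
For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean intersection over union can be computed from predicted and ground-truth labels. It is not taken from the paper: the function name `mean_iou`, the flat-list input format, and the `ignore_index=255` default are assumptions made for illustration. Benchmark scores are normally accumulated over a whole validation set rather than a single image, which this sketch does not attempt.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that appear in pred or target.

    pred, target: flat sequences of integer class labels of equal length.
    ignore_index: label value skipped during scoring (an assumption here).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixel, not scored
        if p == t:
            # Correct pixel: counts once in both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    # Per-class IoU, skipping classes that never occur.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    # Class 0: IoU = 1/2, class 1: IoU = 2/3, so the mean is 7/12.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

Averaging per-class IoU, rather than pixel accuracy, keeps small classes such as poles or traffic signs from being swamped by large ones such as road and sky.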
