Article

WideSegNeXt: Semantic Image Segmentation Using Wide Residual Network and NeXt Dilated Unit

Journal

IEEE SENSORS JOURNAL
Volume 21, Issue 10, Pages 11427-11434

Publisher

IEEE (Institute of Electrical and Electronics Engineers, Inc.)
DOI: 10.1109/JSEN.2020.3008908

Keywords

Image segmentation; Semantics; Feature extraction; Task analysis; Convolution; Spatial resolution; Machine vision; Image processing

Funding

  1. Leading Initiative for Excellent Young Researchers of the Ministry of Education, Culture, Sports, Science, and Technology, Japan [16809746]
  2. State Key Laboratory of Marine Geology in Tongji University


Abstract

Semantic segmentation is widely applied in autonomous driving, robotic picking, and medical imaging. Owing to the breakthrough of deep learning in recent years, methods based on the fully convolutional network (FCN) have become the de facto standard in semantic segmentation. However, a simple FCN has difficulty capturing global context information because its local receptive field is small, and its pooling layers reduce the spatial resolution of the feature maps. In this paper, we address these shortcomings of the FCN by proposing a new architecture, WideSegNeXt, which captures image context at various spatial scales and is effective at identifying small objects. In addition, little position information is lost, since the structure contains no pooling layers. The proposed method achieves a mean intersection over union (mIoU) of 72.5% and a global accuracy (GA) of 92.4% on the CamVid dataset, outperforming previous methods without requiring additional input data.
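
The paper's exact layer definitions are not reproduced on this page, but the core idea the abstract describes, namely parallel dilated convolutions inside a residual, pooling-free unit that enlarges the receptive field while preserving feature-map resolution, can be sketched as follows. This is a minimal illustration in PyTorch under stated assumptions: the class name DilatedUnit, the dilation rates (1, 2, 4), and the 1x1 fusion layer are hypothetical choices for exposition, not the authors' actual WideSegNeXt configuration.

import torch
import torch.nn as nn

class DilatedUnit(nn.Module):
    """Illustrative multi-dilation residual unit (hypothetical; not the
    authors' exact WideSegNeXt definition)."""

    def __init__(self, channels, dilations=(1, 2, 4)):
        super().__init__()
        # Parallel dilated 3x3 convolutions enlarge the receptive field at
        # several spatial scales. With padding equal to the dilation rate,
        # a 3x3 kernel keeps the spatial size unchanged, so no pooling is
        # needed and position information is preserved.
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels, channels, kernel_size=3,
                          padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        )
        # 1x1 convolution fuses the concatenated multi-scale branches
        # back down to `channels` feature maps.
        self.fuse = nn.Conv2d(channels * len(dilations), channels,
                              kernel_size=1)

    def forward(self, x):
        multi_scale = torch.cat([branch(x) for branch in self.branches],
                                dim=1)
        # Residual (skip) connection, as in wide residual networks.
        return x + self.fuse(multi_scale)

# Usage: a 64-channel feature map passes through with its spatial
# resolution intact, unlike a pooled FCN stage.
x = torch.randn(1, 64, 256, 256)
print(DilatedUnit(64)(x).shape)  # torch.Size([1, 64, 256, 256])

The point of the sketch is the trade the abstract highlights: dilation buys a larger receptive field (global context) at multiple scales without the downsampling that makes small objects hard to recover.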

