Article

MSST-Net: A Multi-Scale Adaptive Network for Building Extraction from Remote Sensing Images Based on Swin Transformer

Journal

REMOTE SENSING
Volume 13, Issue 23

Publisher

MDPI
DOI: 10.3390/rs13234743

Keywords

deep learning; remote sensing; transformer; semantic segmentation; multi-scale adaptive


MSST-Net, a multi-scale adaptive segmentation network built on the Swin Transformer, uses the self-attention mechanism to address the difficulty convolutional neural networks have in capturing global features. The network encodes the input image with a Swin Transformer, decodes the feature maps of each level separately, fuses the decoded results with a convolution, and applies a convolution with a 1 × 1 kernel to adjust the channels and produce the final prediction map. On the WHU building dataset, the model improves the evaluation metrics relative to the other segmentation networks it is compared with. The model emphasizes global features for remote sensing segmentation.
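The fusion and channel-adjustment steps rest on one idea: a 1 × 1 convolution computes, at every pixel, a weighted sum of the input channels at that pixel. The following is a minimal pure-Python sketch of that operation only, not the authors' implementation. The function name `conv1x1`, the toy 2 × 2 maps, and the hand-picked weights are all assumptions made for illustration; in a trained network the weights would be learned, and the summary does not state the kernel size of the fusion convolution.

```python
from typing import List

FeatureMap = List[List[float]]  # a single H x W channel


def conv1x1(channels: List[FeatureMap],
            weights: List[List[float]],
            bias: List[float]) -> List[FeatureMap]:
    """Apply a 1 x 1 convolution.

    weights[o][c] is the weight from input channel c to output channel o.
    Each output pixel is a weighted sum of the input channels at the same
    pixel location, plus a per-output-channel bias.
    """
    height, width = len(channels[0]), len(channels[0][0])
    outputs = []
    for w_row, b in zip(weights, bias):
        fmap = [
            [b + sum(w * ch[i][j] for w, ch in zip(w_row, channels))
             for j in range(width)]
            for i in range(height)
        ]
        outputs.append(fmap)
    return outputs


# Hypothetical decoder outputs from three levels, assumed to be already
# upsampled to the same 2 x 2 resolution.
level_maps = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [0.5, 0.5]],
    [[0.0, 1.0], [1.0, 0.0]],
]

# One output channel; these weights are illustrative stand-ins for values
# that training would learn.
fusion_weights = [[0.5, 0.25, 0.25]]
fused = conv1x1(level_maps, fusion_weights, bias=[0.0])
print(fused)  # [[[0.625, 0.375], [0.375, 0.625]]]
```

Because the weights are shared across all pixel positions, learning them amounts to learning how much each decoding level contributes to the fused map.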
Segmentation with deep learning is the main method for interpreting remote sensing images. However, segmentation models based on convolutional neural networks do not capture global features well. A transformer, whose self-attention mechanism supplies each pixel with a global feature, makes up for this deficiency. This paper therefore proposes a multi-scale adaptive segmentation network model (MSST-Net) based on a Swin Transformer. First, a Swin Transformer is used as the backbone to encode the input image. Second, the feature maps of the different levels are decoded separately. Third, a convolution is used for fusion, so that the network can automatically learn the weight of the decoding result of each level. Finally, a convolution with a 1 × 1 kernel adjusts the channels to obtain the final prediction map. Compared with other segmentation network models on the WHU building data set, the proposed model improves all three evaluation metrics: mIoU, F1-score and accuracy. The network model proposed in this paper is a multi-scale adaptive network model that pays more attention to global features for remote sensing segmentation.
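For readers unfamiliar with the three reported metrics, the sketch below computes them from a predicted and a ground-truth binary mask using their standard definitions. It is an illustrative sketch, not the paper's evaluation code: the abstract does not say how mIoU is averaged, so averaging the IoU of the building and background classes is an assumption here, and the function names and toy masks are hypothetical.

```python
from typing import Dict, List, Tuple

Mask = List[List[int]]  # 1 = building, 0 = background


def confusion_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int, int]:
    """Count true positives, false positives, false negatives, true negatives."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def segmentation_metrics(pred: Mask, truth: Mask) -> Dict[str, float]:
    tp, fp, fn, tn = confusion_counts(pred, truth)
    iou_building = ratio(tp, tp + fp + fn)
    iou_background = ratio(tn, tn + fp + fn)
    return {
        # Assumption: mIoU is the mean over the two classes.
        "mIoU": (iou_building + iou_background) / 2,
        "F1": ratio(2 * tp, 2 * tp + fp + fn),
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
    }


# Toy 2 x 4 masks: 2 true positives, 1 false positive,
# 1 false negative, 4 true negatives.
pred_mask = [[1, 1, 0, 0],
             [1, 0, 0, 0]]
true_mask = [[1, 1, 0, 0],
             [0, 0, 0, 1]]
metrics = segmentation_metrics(pred_mask, true_mask)
print(metrics)
```

On these toy masks the building IoU is 2/4, the background IoU is 4/6, the F1-score is 4/6, and the accuracy is 6/8.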
