
SwinSUNet: Pure Transformer Network for Remote Sensing Image Change Detection

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING (2022)

Journal

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING

Volume 60, Issue -, Pages -

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TGRS.2022.3160007

Keywords

Transformers; Task analysis; Feature extraction; Merging; Convolution; Decoding; Semantics; Change detection (CD); deep learning; remote sensing image; transformer

Funding

Tianshan Innovation Team of Xinjiang Uygur Autonomous Region [2020D14044]
National Science Foundation of China [U1903213]

Automated Summary

This article presents a pure Transformer network called SwinSUNet for remote sensing image change detection. SwinSUNet utilizes the global information extraction ability of Transformers and employs an encoder, fusion module, and decoder to achieve change detection and localization.

Abstract

Convolutional neural networks (CNNs) can extract effective semantic features, so they have been widely used for remote sensing image change detection (CD) in recent years. CNNs have achieved great results in the field of CD, but because of the intrinsic locality of the convolution operation, they cannot capture global spatiotemporal information. The transformer, proposed in recent years, can effectively extract global information, so it has been applied to computer vision (CV) tasks with remarkable success. In this article, we design a pure transformer network with a Siamese U-shaped structure to solve CD problems and name it SwinSUNet. SwinSUNet contains three modules, an encoder, a fusion module, and a decoder, all of which use Swin transformer blocks as basic units. The encoder has a Siamese structure based on the hierarchical Swin transformer, so it can process bitemporal images in parallel and extract their multiscale features. The fusion module is mainly responsible for merging the bitemporal features generated by the encoder. Like the encoder, the decoder is based on the hierarchical Swin transformer. Unlike the encoder, which uses patch merging and Swin transformer blocks to generate effective semantic features, the decoder uses upsampling and merging (UM) blocks and Swin transformer blocks to recover the details of the change information. After the sequential processing of these three modules, SwinSUNet outputs the change maps. We conducted extensive experiments on four CD datasets, and in these experiments SwinSUNet achieved better results than other related methods.
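The data flow described in the abstract (Siamese encoder, fusion, decoder) can be traced with a minimal NumPy sketch. It is not the authors' implementation: the Swin transformer blocks and patch embedding are omitted entirely, and only the resolution and channel bookkeeping is shown. `patch_merging` follows the standard Swin formulation (concatenate each 2x2 neighbourhood, then project 4C to 2C). Everything else is an assumption made for illustration: the fusion is modelled as channel concatenation plus a linear projection, `upsample_and_merge` is a hypothetical stand-in for the UM block (nearest-neighbour upsampling, concatenation with a same-scale fused feature, linear projection), and the names `init_params`, `encode`, and `change_features` are invented here.

```python
import numpy as np


def patch_merging(x, weight):
    """Swin-style patch merging: (H, W, C) -> (H/2, W/2, 2C).

    The four pixels of each 2x2 neighbourhood are concatenated along the
    channel axis (4C) and projected to 2C by `weight` of shape (4C, 2C).
    """
    h, w, _ = x.shape
    assert h % 2 == 0 and w % 2 == 0, "H and W must be even"
    merged = np.concatenate(
        [x[0::2, 0::2], x[1::2, 0::2], x[0::2, 1::2], x[1::2, 1::2]], axis=-1
    )
    return merged @ weight


def upsample_and_merge(x, skip, weight):
    """Hypothetical UM stand-in: 2x nearest-neighbour upsampling of `x`,
    channel concatenation with `skip`, then a linear projection."""
    up = x.repeat(2, axis=0).repeat(2, axis=1)
    return np.concatenate([up, skip], axis=-1) @ weight


def init_params(c0, depth, rng):
    """Random weights for `depth` merging stages (assumed layout)."""
    chans = [c0 * 2**i for i in range(depth + 1)]
    return {
        "merge": [0.1 * rng.standard_normal((4 * c, 2 * c)) for c in chans[:-1]],
        "fuse": [0.1 * rng.standard_normal((2 * c, c)) for c in chans],
        "um": [0.1 * rng.standard_normal((3 * c, c)) for c in reversed(chans[:-1])],
    }


def encode(img, merge_weights):
    """Return multiscale features, finest scale first."""
    feats = [img]
    for w in merge_weights:
        feats.append(patch_merging(feats[-1], w))
    return feats


def change_features(img_a, img_b, params):
    # Siamese encoder: both dates go through the same weights.
    feats_a = encode(img_a, params["merge"])
    feats_b = encode(img_b, params["merge"])
    # Fusion (assumed): concatenate bitemporal features at each scale.
    fused = [
        np.concatenate([fa, fb], axis=-1) @ wf
        for fa, fb, wf in zip(feats_a, feats_b, params["fuse"])
    ]
    # Decoder: start at the coarsest scale and recover resolution step by step.
    x = fused[-1]
    for skip, w in zip(reversed(fused[:-1]), params["um"]):
        x = upsample_and_merge(x, skip, w)
    return x


rng = np.random.default_rng(0)
params = init_params(c0=3, depth=2, rng=rng)
before = rng.standard_normal((8, 8, 3))
after = rng.standard_normal((8, 8, 3))
print(change_features(before, after, params).shape)
```

With two merging stages, an 8x8x3 input is reduced to 2x2x12 at the coarsest scale and restored to 8x8x3 by the decoder. A real model would interleave attention blocks at every scale and end with a classification head that turns the decoded features into a change map.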
