4.7 Article

CS-HSNet: A Cross-Siamese Change Detection Network Based on Hierarchical-Split Attention

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/JSTARS.2021.3113831

Keywords

Feature extraction; Task analysis; Semantics; Remote sensing; Computational efficiency; Transformers; Licenses; Attention mechanism; hierarchical-split structure; image change detection; multiscale features

Funding

  1. Shenzhen Science and Technology Program [KQTD20190929172704911]

This work proposes a new siamese change detection feature encoder backbone, CSRes2Net, which represents dual features in a fine-grained manner and captures cross-dimensional long-range relationships using a lightweight cross spatial-channel triplet attention module. A hierarchical-split block is also introduced to generate multiscale feature representations in a coarse-to-fine fashion. Experimental results on LEVIR-CD and the season-varying change detection dataset show superior performance compared to most state-of-the-art models.
Change detection methods for optical remote sensing images play an important role in environmental resource management. Recent deep learning methods perform well, but they typically build networks in one of two ways: first, extracting bitemporal features separately; second, fusing the bitemporal images before forwarding them into a single-level network. Both approaches severely neglect the spatial-temporal feature correlation between bitemporal images. In addition, most existing methods represent multiscale feature pairs in a layer-wise manner, as in ResNet, and fail to consider the inner multilevel structure. In this work, we propose a new siamese change detection feature encoder backbone, cross-siamese Res2Net (CSRes2Net), which establishes crossed and hierarchical residual-like connections within a single residual block. CSRes2Net represents dual features in a fine-grained manner and promotes the flow of bitemporal features. Recent learning-based methods have also designed spatial-temporal relation modules, based on the self-attention mechanism, to capture pixel-level pairwise relationships and channel dependencies, but they consider spatial and channel correlations only separately and require excessive parameters. We therefore propose a lightweight cross spatial-channel triplet attention module that captures cross-dimensional long-range relationships among three combinations: channel with height, channel with width, and channel with channel. Finally, we propose a hierarchical-split block that generates multiscale feature representations in a coarse-to-fine fashion. Experimental results on LEVIR-CD and the season-varying change detection dataset show that the proposed method outperforms most state-of-the-art models.
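The "crossed and hierarchical residual-like connections within a single residual block" can be pictured with a toy sketch. This is not the authors' implementation: the abstract does not specify the wiring, so the cross-connection rule below (each branch adds the other branch's previous split output) is an assumption made purely for illustration. The names `split_channels`, `cross_hierarchical_block`, and `transform` are hypothetical, and `transform` stands in for whatever learned convolution a real block would use. Feature maps are reduced to flat channel vectors to keep the example dependency-free.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def split_channels(x: Sequence[float], scales: int) -> List[Vector]:
    """Split a channel vector into `scales` equal, contiguous groups."""
    if scales < 1 or len(x) % scales != 0:
        raise ValueError("channel count must be divisible by scales")
    width = len(x) // scales
    return [list(x[i * width:(i + 1) * width]) for i in range(scales)]


def add(a: Sequence[float], b: Sequence[float]) -> Vector:
    """Element-wise sum of two equal-length vectors."""
    return [p + q for p, q in zip(a, b)]


def cross_hierarchical_block(
    x_t1: Sequence[float],
    x_t2: Sequence[float],
    scales: int,
    transform: Callable[[Vector], Vector],
) -> Tuple[Vector, Vector]:
    """Toy two-branch block with hierarchical and crossed connections.

    ASSUMPTION (illustrative only): split i of each branch receives its own
    input split, its own previous output, and the other branch's previous
    output. The first split passes through unchanged.
    """
    g1 = split_channels(x_t1, scales)
    g2 = split_channels(x_t2, scales)
    out1, out2 = [g1[0]], [g2[0]]
    for i in range(1, scales):
        prev1, prev2 = out1[-1], out2[-1]
        # Hierarchical term (own previous output) + crossed term (other branch).
        out1.append(transform(add(add(g1[i], prev1), prev2)))
        out2.append(transform(add(add(g2[i], prev2), prev1)))
    # Concatenate the splits again and add the block-level residual.
    y1 = [v for g in out1 for v in g]
    y2 = [v for g in out2 for v in g]
    return add(y1, x_t1), add(y2, x_t2)
```

Calling `cross_hierarchical_block([1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 0.0, 0.0], 2, lambda v: v)` shows the effect of the crossed term: the second branch has all-zero input, yet its second split still receives information from the first branch's first split.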
