4.7 Article

CS-HSNet: A Cross-Siamese Change Detection Network Based on Hierarchical-Split Attention

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/JSTARS.2021.3113831

Keywords

Feature extraction; Task analysis; Semantics; Remote sensing; Computational efficiency; Transformers; Licenses; Attention mechanism; hierarchical-split structure; image change detection; multiscale features

Funding

  1. Shenzhen Science and Technology Program [KQTD20190929172704911]

This work proposes a new siamese change detection feature encoder backbone named CSRes2Net, which represents dual features in a fine-grained manner, together with a lightweight cross spatial-channel triplet attention module that captures cross-dimensional long-range relationships. A hierarchical-split block is also introduced to generate multiscale feature representations in a coarse-to-fine fashion. Experimental results on the LEVIR-CD and season-varying change detection datasets demonstrate superior performance compared to most state-of-the-art models.
Change detection methods for optical remote sensing images play an important role in environmental resource management. Recent deep learning methods demonstrate strong ability, but they typically construct their networks in one of two ways: first, extracting bitemporal features separately; second, fusing the bitemporal images before forwarding them into a single-level network. Both approaches severely neglect the spatial-temporal feature correlation between bitemporal images. In addition, most existing methods represent multiscale feature pairs in a layer-wise manner, as ResNet does, and fail to consider the inner multilevel structure. In this work, we propose a new siamese change detection feature encoder backbone named cross-siamese Res2Net (CSRes2Net), which establishes crossed and hierarchical residual-like connections within a single residual block. CSRes2Net represents dual features in a fine-grained manner and fully promotes the flow of bitemporal features. Recent learning-based methods have also designed spatial-temporal relation modules that capture pixel-level pairwise relationships and channel dependencies with a self-attention mechanism, but they consider spatial and channel correlations only separately and require excessive parameters. We therefore propose a lightweight cross spatial-channel triplet attention module to capture cross-dimensional long-range relationships among three combinations: channel with height, channel with width, and channel with channel. Finally, we propose a hierarchical-split block for generating multiscale feature representations in a coarse-to-fine fashion. Experimental results on the LEVIR-CD and season-varying change detection datasets show that the method outperforms most state-of-the-art models.
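
The abstract does not spell out how the crossed and hierarchical connections are wired, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a Res2Net-style split of each temporal feature into channel groups, where every group after the first is transformed and the previous output of the other temporal branch is added to its input. The names `split_channels`, `cross_hierarchical_block`, and `transform` are hypothetical, and `transform` stands in for the convolution that a real block would apply.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def split_channels(x: Sequence[float], groups: int) -> List[Vector]:
    """Split a flat channel vector into `groups` equal-width chunks."""
    if groups < 1 or len(x) % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    width = len(x) // groups
    return [list(x[i * width:(i + 1) * width]) for i in range(groups)]


def add(a: Sequence[float], b: Sequence[float]) -> Vector:
    """Element-wise sum of two equally sized vectors."""
    return [u + v for u, v in zip(a, b)]


def cross_hierarchical_block(
    x1: Sequence[float],
    x2: Sequence[float],
    groups: int,
    transform: Callable[[Vector], Vector],
) -> Tuple[Vector, Vector]:
    """Hypothetical crossed, hierarchical residual block for two temporal inputs."""
    s1, s2 = split_channels(x1, groups), split_channels(x2, groups)
    # As in Res2Net, the first group is passed through untouched.
    out1, out2 = [s1[0]], [s2[0]]
    prev1 = prev2 = None
    for i in range(1, groups):
        in1, in2 = s1[i], s2[i]
        if prev1 is not None and prev2 is not None:
            # Assumed crossing: each branch receives the OTHER branch's
            # previous group output before its own transform.
            in1, in2 = add(in1, prev2), add(in2, prev1)
        prev1, prev2 = transform(in1), transform(in2)
        out1.append(prev1)
        out2.append(prev2)
    # Concatenate the groups and add the block-level residual connection.
    y1 = [v for group in out1 for v in group]
    y2 = [v for group in out2 for v in group]
    return add(y1, x1), add(y2, x2)


if __name__ == "__main__":
    def double(group: Vector) -> Vector:
        # Placeholder for a learned convolution.
        return [2 * v for v in group]

    y1, y2 = cross_hierarchical_block([1, 1, 1, 1], [0, 0, 0, 0], 4, double)
    print(y1, y2)
```

In the usage example the second input is all zeros, yet its output is not, which illustrates how information from one temporal branch would reach the other under this assumed wiring.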
