4.6 Article

TransCD: scene change detection via transformer-based architecture

Journal

OPTICS EXPRESS
Volume 29, Issue 25, Pages 41409-41427

Publisher

OPTICAL SOC AMER
DOI: 10.1364/OE.440720

Keywords

-

Categories

Funding

  1. Science and Technology Program of Sichuan [2021YJ0080]
  2. National Natural Science Foundation of China [61771409]

Scene change detection (SCD) identifies changes between bi-temporal images while suppressing noisy changes induced by camera motion or environment variation. The paper proposes a transformer-based SCD architecture that incorporates a siamese vision transformer to establish global semantic relations and model long-range context; it is reported to outperform state-of-the-art models while using fewer parameters.
Scene change detection (SCD) is the task of identifying changes of interest between bi-temporal images. A key challenge in SCD is identifying changes of interest while suppressing noisy changes induced by camera motion or environment variation, such as viewpoint differences, dynamic changes, and outdoor conditions. These noisy changes cause corresponding pixel pairs to differ spatially (position relation) and temporally (intensity relation). Because of their limited local receptive field, traditional models based on convolutional neural networks (CNNs) have difficulty establishing long-range relations for semantic changes. To address these challenges, we explore the potential of transformers in SCD and propose a transformer-based SCD architecture (TransCD). Following the intuition that an SCD model should be able to model both interesting and noisy changes, we incorporate a siamese vision transformer (SViT) into a feature-difference SCD framework. Our motivation is that SViT can establish global semantic relations and model long-range context, which makes it more robust to noisy changes. In addition, unlike pure CNN-based models with high computational complexity, the proposed model is more efficient and has fewer parameters. Extensive experiments on the CDNet-2014 dataset demonstrate that the proposed TransCD (SViT-E1-D1-32) outperforms state-of-the-art SCD models, achieving an F1 score of 0.9361, an improvement of 7.31%. © 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement
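
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of a pixel-wise F1 score for binary change masks. It assumes the common definition (harmonic mean of precision and recall over changed pixels); the paper's exact evaluation protocol on CDNet-2014 is not stated in this record, and the function name `f1_score` and the toy masks are illustrative only.

```python
def f1_score(pred, truth):
    """Pixel-wise F1 for two binary change masks (nested lists of 0/1).

    Assumption: 1 marks a changed pixel, 0 an unchanged one.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # change predicted and present
            elif p and not t:
                fp += 1      # change predicted but absent (false alarm)
            elif t and not p:
                fn += 1      # change present but missed
    if tp == 0:
        return 0.0           # no correct detections: define F1 as 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy 2x3 masks (hypothetical data, not from the paper).
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
score = f1_score(pred, truth)
```

In this toy case there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and the F1 score is also 2/3.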
