Article

MSFusion: Multistage for Remote Sensing Image Spatiotemporal Fusion Based on Texture Transformer and Convolutional Neural Network

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/JSTARS.2022.3179415

Keywords

Remote sensing; Feature extraction; Predictive models; Transformers; Spatial resolution; Fuses; Spatiotemporal phenomena; Multistage feature fusion; multitemporal remote sensing data; remote sensing; self-attention; spatiotemporal fusion; transformer

Funding

  1. National Natural Science Foundation of China [61966035]
  2. National Science Foundation of China [U1803261]
  3. Natural Science Foundation of the Xin Jiang Uygur Autonomous Region [2021D01C077]
  4. Autonomous Region Graduate Innovation Project [XJ2019G069, XJ2021G062, XJ2020G074]

Abstract

Due to the limitations of current technology, a single satellite sensor cannot capture remote sensing images with both high spatial and high temporal resolution. To address this, a multistage remote sensing image spatiotemporal fusion model based on a texture transformer and a convolutional neural network is proposed, which fuses features at different scales and achieves excellent results.
Due to the limitations of current technology and budget, a single satellite sensor cannot obtain high spatiotemporal resolution remote sensing images. Spatiotemporal fusion of remote sensing images is therefore considered an effective solution and has attracted extensive attention. In deep learning approaches, the fixed receptive field of a convolutional neural network makes it unable to model correlations among global features, and features extracted through convolution alone cannot capture long-range dependencies. At the same time, overly complex fusion schemes fail to integrate temporal and spatial features effectively. To solve these problems, we propose a multistage remote sensing image spatiotemporal fusion model based on a texture transformer and a convolutional neural network. The model combines the advantages of transformers and convolutional networks: it uses a lightweight convolutional network to extract spatial features and temporal discrepancy features, uses a transformer to learn global temporal correlations, and finally fuses the temporal features with the spatial features. To make full use of the features obtained at different stages, we design a cross-stage adaptive fusion module (CSAFM). The module adopts a self-attention mechanism to adaptively integrate features of different scales while accounting for their temporal and spatial characteristics. To test the robustness of the model, experiments are carried out on three datasets: CIA, LGC, and DX. Compared with five typical spatiotemporal fusion algorithms, our model obtains excellent results, demonstrating the superiority of MSFusion.
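The abstract does not give the internals of the CSAFM, so the following is only an illustrative sketch of the general idea it describes: features from one stage form attention queries over features from another stage, and the attended result is fused back residually. All shapes, projection weights, and the function name `csafm_fuse` are assumptions, not the paper's actual design.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def csafm_fuse(feat_a, feat_b, d_k=16, seed=0):
    """Hypothetical cross-stage fusion sketch: stage-A features attend
    over stage-B features via scaled dot-product self-attention, and
    the attended values are added back to stage A (residual fusion)."""
    rng = np.random.default_rng(seed)
    n, c = feat_a.shape
    # Random projections stand in for learned query/key/value weights.
    w_q = rng.standard_normal((c, d_k)) / np.sqrt(c)
    w_k = rng.standard_normal((c, d_k)) / np.sqrt(c)
    w_v = rng.standard_normal((c, c)) / np.sqrt(c)
    q, k, v = feat_a @ w_q, feat_b @ w_k, feat_b @ w_v
    attn = softmax(q @ k.T / np.sqrt(d_k))  # (n, n) cross-stage attention map
    return feat_a + attn @ v                # residual, adaptively weighted fusion

# Two stages of features, flattened to n positions with c channels each.
a = np.random.default_rng(1).standard_normal((64, 32))
b = np.random.default_rng(2).standard_normal((64, 32))
fused = csafm_fuse(a, b)
print(fused.shape)  # (64, 32)
```

In the actual model the two stages would have different spatial scales and learned projections; here both are flattened to the same size purely to keep the attention arithmetic visible.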

