4.7 Article

A CNN-Transformer Network With Multiscale Context Aggregation for Fine-Grained Cropland Change Detection

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/JSTARS.2022.3177235

Keywords

Feature extraction; Transformers; Head; Data mining; Task analysis; Decoding; Biological system modeling; Change detection (CD); cropland; deep learning (DL); remote sensing; transformer

Funding

  1. National Natural Science Foundation of China [61976234]
  2. Fundamental Research Funds for the Central Universities, Sun Yat-sen University [22qntd2001]

Nonagriculturalization incidents are serious threats to local agricultural ecosystems and global food security. The proposed MSCANet combines the merits of CNNs and transformers to achieve efficient and effective cropland change detection. The article also provides a new cropland change detection dataset.

Nonagriculturalization incidents are serious threats to local agricultural ecosystems and global food security. Remote sensing change detection (CD) offers an effective way to detect and prevent such incidents in time. However, existing CD methods struggle with the large intraclass differences of cropland changes in high-resolution images. In addition, traditional CNN-based models suffer from the loss of long-range context information and from the high computational complexity of deep layers. Therefore, in this article, we propose a CNN-transformer network with multiscale context aggregation (MSCANet), which combines the merits of CNNs and transformers to achieve efficient and effective cropland CD. In MSCANet, a CNN-based feature extractor first captures hierarchical features, and a transformer-based multiscale context aggregation (MSCA) module then encodes and aggregates context information. Finally, a multibranch prediction head with three CNN classifiers produces the change maps and strengthens the supervision of the deep layers. In addition, because no existing CD dataset targets fine-grained cropland change, we also provide a new cropland change detection dataset (CLCD), which contains 600 pairs of 512 × 512 bi-temporal images with a spatial resolution of 0.5 to 2 m. Comparative experiments with several CD models demonstrate the effectiveness of MSCANet, which achieves the highest F1 of 64.67% on the high-resolution semantic CD dataset and of 71.29% on CLCD.
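The abstract reports results as F1 scores but does not say how they are computed. As a minimal sketch, assuming the common convention of a pixel-wise F1 on the "changed" class of a binary change map, the metric could be computed as follows. The function name `f1_score` and the toy maps are illustrative assumptions, not taken from the paper.

```python
def f1_score(pred, truth):
    """Pixel-wise F1 for the 'changed' class (label 1).

    Assumption: `pred` and `truth` are same-shaped binary change maps
    given as nested lists of 0/1. This is a generic definition, not
    necessarily the exact evaluation protocol used by the authors.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1          # changed pixel correctly detected
            elif p == 1 and t == 0:
                fp += 1          # false alarm
            elif p == 0 and t == 1:
                fn += 1          # missed change
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives
    return 2 * tp / denom if denom else 0.0


# Toy 2 x 3 maps (hypothetical data): TP = 2, FP = 1, FN = 1
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
print(round(f1_score(pred, truth), 4))  # 0.6667
```

Because F1 ignores true negatives, it is a sensible choice when unchanged pixels dominate the image, as is typical in change detection.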
