4.7 Article

Global and Local Contrastive Self-Supervised Learning for Semantic Segmentation of HR Remote Sensing Images

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TGRS.2022.3147513

关键词

Semantics; Remote sensing; Task analysis; Image segmentation; Supervised learning; Force; Feature extraction; Contrastive learning; remote sensing image (RSI) semantic segmentation; self-supervised learning (SSL)

资金

  1. National Natural Science Foundation of China [41871364, 41861048, 42171376]
  2. Fundamental Research Funds for the Central Universities of Central South University [2021zzts0842]

向作者/读者索取更多资源

This study proposes a global style and local matching contrastive learning network (GLCNet) for remote sensing image (RSI) semantic segmentation. By using global style contrastive learning and local feature matching contrastive learning modules, the method achieves superior results compared to state-of-the-art methods and supervised learning methods on various RSI semantic segmentation datasets.
Recently, supervised deep learning has achieved a great success in remote sensing image (RSI) semantic segmentation. However, supervised learning for semantic segmentation requires a large number of labeled samples, which is difficult to obtain in the field of remote sensing. A new learning paradigm, self-supervised learning (SSL), can be used to solve such problems by pretraining a general model with a large number of unlabeled images and then fine-tuning it on a downstream task with very few labeled samples. Contrastive learning is a typical method of SSL that can learn general invariant features. However, most existing contrastive learning methods are designed for classification tasks to obtain an image-level representation, which may be suboptimal for semantic segmentation tasks requiring pixel-level discrimination. Therefore, we propose a global style and local matching contrastive learning network (GLCNet) for RSI semantic segmentation. Specifically, first, the global style contrastive learning module is used to better learn an image-level representation, as we consider that style features can better represent the overall image features. Next, the local features matching the contrastive learning module is designed to learn the representations of local regions, which is beneficial for semantic segmentation. We evaluate four RSI semantic segmentation datasets, and the experimental results show that our method mostly outperforms the state-of-the-art self-supervised methods and the ImageNet pretraining method. Specifically, with 1% annotation from the original dataset, our approach improves Kappa by 6% on the International Society for Photogrammetry and Remote Sensing (ISPRS) Potsdam dataset relative to the existing baseline. Moreover, our method outperforms supervised learning methods when there are some differences between the datasets of upstream tasks and downstream tasks. Our study promotes the development of SSL in the field of RSI semantic segmentation. Since SSL could directly learn the essential characteristics of data from unlabeled data, which is easy to obtain in the remote sensing field, this may be of great significance for tasks such as global mapping. The source code is available at https://github.com/GeoX-Lab/G-RSIM.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.7
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据