
CloudViT: A Lightweight Vision Transformer Network for Remote Sensing Cloud Detection

IEEE GEOSCIENCE AND REMOTE SENSING LETTERS (2023)

Journal

IEEE GEOSCIENCE AND REMOTE SENSING LETTERS

Volume 20, Issue -, Pages -

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/LGRS.2022.3233122

Keywords

Cloud computing; Remote sensing; Feature extraction; Earth; Artificial satellites; Satellites; Transformers; Attention mechanism; cloud detection; deep learning; remote sensing image; vision transformer (ViT)

Automated Summary

In this study, a lightweight vision transformer network called CloudViT is proposed for cloud detection from satellite imagery. By using dark channel priors in multispectral imagery to guide the network to learn features, the network is able to enhance image features and produce more accurate cloud detection results. Additionally, a plug-and-play channel adaptive module is introduced to address the inconsistency in the number of bands from different satellite sensors.

Abstract

Clouds are unavoidable in satellite images and limit how those images can be processed and applied. Cloud detection is therefore a necessary preprocessing step in satellite image extraction and analysis. However, existing methods struggle to extract robust features, and their large parameter counts and computational cost hinder model deployment. In this letter, cloud vision transformer (CloudViT), a lightweight vision transformer network for cloud detection from satellite imagery, is proposed. To use dark channel priors in multispectral imagery to guide feature learning, a multiscale dark channel extractor first predicts the dark channels. The dark channel features and image features are then fed to an attention-based, dark channel-guided context aggregation module that enhances the image features, which in turn makes the cloud detection results more accurate. In addition, to improve the transferability of the network across satellite sensors, a plug-and-play channel adaptive module is proposed to handle the differing numbers of bands among sensors. Experimental results on the Landsat7 dataset show that CloudViT outperforms state-of-the-art methods while keeping the number of parameters and the computational cost small. Transfer experiments on three other datasets show that the channel adaptive module substantially improves the transferability of the model.
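
To make the dark channel prior mentioned in the abstract concrete, the following is a minimal sketch of the classical, hand-crafted dark channel: the per-pixel minimum over spectral bands followed by a minimum over a local patch. It is not the learned multiscale dark channel extractor described in the letter, whose details are not given here. The function name dark_channel, the patch parameter, and the edge-padding choice are assumptions made for illustration only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def dark_channel(image: np.ndarray, patch: int = 3) -> np.ndarray:
    """Classical dark channel of an (H, W, C) image; returns an (H, W) array.

    Hypothetical helper for illustration; not taken from the CloudViT letter.
    """
    if image.ndim != 3:
        raise ValueError("expected an (H, W, C) array")
    if patch < 1 or patch % 2 == 0:
        raise ValueError("patch must be a positive odd integer")

    # Step 1: minimum across the spectral bands at every pixel.
    band_min = image.min(axis=2)

    # Step 2: minimum over a patch x patch neighbourhood (a min filter).
    # Edge padding keeps the output the same size as the input.
    pad = patch // 2
    padded = np.pad(band_min, pad, mode="edge")
    windows = sliding_window_view(padded, (patch, patch))
    return windows.min(axis=(2, 3))


# Toy 4x4 image with 3 bands: bright everywhere except one dark band value.
toy = np.ones((4, 4, 3), dtype=np.float64)
toy[0, 0, 1] = 0.2
dc = dark_channel(toy, patch=3)
```

In this toy example the single low band value at pixel (0, 0) lowers the dark channel of every pixel whose 3x3 neighbourhood contains it, while pixels further away keep the value 1.0. Regions that are bright in every band, as thick clouds tend to be, keep a high dark channel value, which is one plausible reason such a prior can serve as a guide for cloud detection.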
