Journal
2022 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2022)
Pages 7996-7999
Publisher
IEEE
DOI: 10.1109/IGARSS46834.2022.9883199
Keywords
Remote sensing image caption; Transformer
Funding
- National Natural Science Foundation of China [42071350, 42171336]
- LIESMARS Special Research Funding
This paper proposes a pure Transformer architecture (CapFormer) for accurately describing high-spatial-resolution remote sensing images. By combining a scalable vision Transformer encoder with a Transformer decoder, CapFormer outperforms state-of-the-art image caption methods in summarizing complex scenes.
Accurately describing high-spatial-resolution remote sensing images requires understanding both the inner attributes of objects and the outer relations between different objects. Existing image caption algorithms lack the ability to form global representations and are therefore ill-suited to summarizing complex scenes. To this end, we propose a pure Transformer architecture (CapFormer) for remote sensing image captioning. Specifically, a scalable vision Transformer is adopted for image representation, where the global content can be captured with multi-head self-attention layers. A Transformer decoder is designed to successively translate the image features into comprehensive sentences. The decoder explicitly models the previously generated words and interacts with the image features through cross-attention layers. Comprehensive and ablation experiments on the RSICD dataset demonstrate that CapFormer outperforms state-of-the-art image caption methods.
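The cross-attention step described above, where decoder word features query the encoder's image patch features, can be sketched as follows. This is a minimal single-head illustration with hypothetical dimensions (32-d word features, 48-d patch features), not the paper's actual CapFormer implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(words, patches, d_k=64, seed=0):
    """Single-head cross-attention: decoder word features act as
    queries over encoder image-patch features (keys and values),
    as in a Transformer decoder's cross-attention layer.
    Projection weights are random here for illustration only."""
    rng = np.random.default_rng(seed)
    d_w, d_p = words.shape[-1], patches.shape[-1]
    W_q = rng.standard_normal((d_w, d_k)) / np.sqrt(d_w)
    W_k = rng.standard_normal((d_p, d_k)) / np.sqrt(d_p)
    W_v = rng.standard_normal((d_p, d_k)) / np.sqrt(d_p)
    Q = words @ W_q            # (T, d_k) queries from decoded words
    K = patches @ W_k          # (N, d_k) keys from image patches
    V = patches @ W_v          # (N, d_k) values from image patches
    attn = softmax(Q @ K.T / np.sqrt(d_k))  # (T, N) attention map
    return attn @ V, attn

# toy example: 5 decoded words attending over 49 image patches
words = np.random.default_rng(1).standard_normal((5, 32))
patches = np.random.default_rng(2).standard_normal((49, 48))
out, attn = cross_attention(words, patches)
```

Each row of `attn` is a distribution over image patches, so every generated word can draw on the whole image rather than a local window, which is the global-representation property the abstract emphasizes.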