Article

Fusion-Based Correlation Learning Model for Cross-Modal Remote Sensing Image Retrieval

Journal

IEEE Geoscience and Remote Sensing Letters
Publisher

IEEE (Institute of Electrical and Electronics Engineers Inc.)
DOI: 10.1109/LGRS.2021.3131592

Keywords

Feature extraction; Semantics; Image retrieval; Fuses; Correlation; Buildings; Representation learning; Correlation learning; cross-modal retrieval; multimodal fusion; text-remote sensing (RS) image matching

Funding

  1. National Natural Science Foundation of China [61790550, 61790554, 91538201]

Abstract

In this study, a fusion-based correlation learning model is proposed to address the heterogeneity gap in remote sensing image-text retrieval. By designing a cross-modal fusion network and applying knowledge distillation, the model improves the discriminative ability of the feature representations and enhances intermodality semantic consistency.
With the increase in cross-modal data, cross-modal retrieval has attracted growing attention in remote sensing (RS), since it provides a more flexible and convenient way to obtain information of interest than traditional retrieval. However, existing methods cannot fully exploit the semantic information: they focus only on semantic consistency and ignore the complementary information between different modalities. In this letter, to bridge the modality gap, we propose a novel fusion-based correlation learning model (FCLM) for image-text retrieval in RS. Specifically, a cross-modal fusion network is designed to capture the intermodality complementary information and produce a fused feature. The fused knowledge is further transferred to supervise the learning of the modality-specific networks by knowledge distillation, which helps improve the discriminative ability of the feature representations and enhance intermodality semantic consistency, thereby alleviating the heterogeneity gap. Finally, extensive experiments conducted on a public dataset show that FCLM is effective for cross-modal retrieval and outperforms several baseline methods.
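
The letter itself contains no code; the sketch below is a minimal PyTorch illustration of the two mechanisms the abstract describes: a fusion branch that combines image and text features into a joint embedding, and a distillation term that uses that fused embedding as a teacher signal for the modality-specific networks. All module names, dimensions, the InfoNCE matching loss, and the 0.5 loss weight are assumptions for illustration, not the authors' actual FCLM design.

# Minimal sketch of fusion-based correlation learning in PyTorch.
# Feature extractors are assumed to yield 512-D image/text features;
# every name, size, and weight here is illustrative, not the authors'
# actual FCLM implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionBranch(nn.Module):
    """Teacher: fuses image and text features into a joint embedding."""
    def __init__(self, dim=512):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(),
            nn.Linear(dim, dim),
        )

    def forward(self, img_feat, txt_feat):
        # Concatenation is the simplest stand-in for a cross-modal
        # fusion network that captures complementary information.
        joint = torch.cat([img_feat, txt_feat], dim=-1)
        return F.normalize(self.fuse(joint), dim=-1)

class ModalityHead(nn.Module):
    """Student: projects one modality into the shared retrieval space."""
    def __init__(self, dim=512):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, feat):
        return F.normalize(self.proj(feat), dim=-1)

def training_step(img_feat, txt_feat, fusion, img_head, txt_head):
    fused = fusion(img_feat, txt_feat)        # teacher embedding
    img_emb = img_head(img_feat)              # student embeddings
    txt_emb = txt_head(txt_feat)

    # Cross-modal matching term: symmetric InfoNCE over in-batch pairs,
    # a common retrieval loss (the letter may use a different one).
    logits = img_emb @ txt_emb.t() / 0.07
    targets = torch.arange(logits.size(0), device=logits.device)
    match = (F.cross_entropy(logits, targets)
             + F.cross_entropy(logits.t(), targets)) / 2

    # Knowledge distillation: pull each modality-specific embedding
    # toward the (detached) fused teacher embedding.
    distill = (F.mse_loss(img_emb, fused.detach())
               + F.mse_loss(txt_emb, fused.detach()))

    return match + 0.5 * distill  # 0.5 is an arbitrary trade-off weight

Note that at retrieval time only the modality-specific heads embed queries and gallery items, so the fusion branch adds no cost at query time; the fused teacher only shapes training.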
