☆ 4.7 Article

CMGFNet: A deep cross-modal gated fusion network for building extraction from very high-resolution remote sensing images

ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING (2022)

Journal

ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING

Volume 184, Issue -, Pages 96-115

Publisher

ELSEVIER

DOI: 10.1016/j.isprsjprs.2021.12.007

Keywords

Building extraction; VHR remote sensing image; Digital surface model; Gated fusion module; Cross-modal

Categories

Geography, Physical; Geosciences, Multidisciplinary; Remote Sensing; Imaging Science & Photographic Technology

AI Summary

This research proposes a cross-modal gated fusion network (CMGFNet) for extracting building footprints from high-resolution remote sensing images and DSM data. CMGFNet uses separate encoders to extract features from RGB and DSM data and employs cross-modal and multi-level feature fusion methods. Experimental results demonstrate that CMGFNet outperforms other state-of-the-art models, and an extensive ablation study confirms the efficacy of all significant elements.

Abstract

The extraction of urban structures such as buildings from very high-resolution (VHR) remote sensing imagery has improved dramatically thanks to recent developments in deep multimodal fusion models. However, because building objects in VHR images show varied colour intensities and complex textures, and because the digital surface model (DSM) is of low quality, it is challenging to develop an optimal cross-modal fusion network that takes advantage of both modalities. This research presents an end-to-end cross-modal gated fusion network (CMGFNet) for extracting building footprints from VHR remote sensing images and DSM data. CMGFNet extracts multi-level features from RGB and DSM data using two separate encoders. We offer two methods for fusing features: cross-modal feature fusion and multi-level feature fusion. For cross-modal feature fusion, a gated fusion module (GFM) is proposed to combine the two modalities efficiently. Multi-level feature fusion combines high-level features from deep layers with low-level features from shallower layers through a top-down strategy. Furthermore, a residual-like depth-wise separable convolution (R-DSC) is introduced to enhance the up-sampling process and to reduce the number of parameters and the time complexity of the decoder. Experimental results on challenging datasets show that CMGFNet outperforms other state-of-the-art models. An extensive ablation study also confirms the efficacy of all significant elements.
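The abstract names a gated fusion module and depth-wise separable convolutions but does not give their exact form. The sketch below is only an illustration under stated assumptions, not the authors' implementation: it assumes a common gating scheme in which a sigmoid gate weighs the RGB feature against the DSM feature at each position, and it uses the standard weight-count formulas for an ordinary convolution versus a depth-wise separable one (without the residual-like connection of R-DSC). All function names and the scalar gate weights `w_rgb`, `w_dsm`, and `bias` are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(rgb, dsm, w_rgb, w_dsm, bias):
    """Fuse two equal-length feature vectors with a per-element gate.

    Assumed form (not taken from the paper):
        g     = sigmoid(w_rgb * r + w_dsm * d + bias)
        fused = g * r + (1 - g) * d
    A gate near 1 favours the RGB feature, near 0 the DSM feature.
    """
    fused = []
    for r, d in zip(rgb, dsm):
        g = sigmoid(w_rgb * r + w_dsm * d + bias)
        fused.append(g * r + (1.0 - g) * d)
    return fused


def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k convolution (biases ignored)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depth-wise conv plus a 1 x 1 point-wise conv."""
    return c_in * k * k + c_in * c_out


if __name__ == "__main__":
    # With zero weights and zero bias the gate is 0.5, so the result
    # is the plain average of the two modalities.
    print(gated_fusion([1.0, 2.0], [3.0, 4.0], 0.0, 0.0, 0.0))
    print(standard_conv_params(64, 128, 3))
    print(depthwise_separable_params(64, 128, 3))
```

For a hypothetical layer with 64 input channels, 128 output channels, and a 3 x 3 kernel, the ordinary convolution has 73,728 weights and the depth-wise separable version has 8,768, which illustrates why such layers can reduce decoder parameters.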
