Article

A general deep learning based framework for 3D reconstruction from multi-view stereo satellite images

ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING (2023)

Journal

ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING

Volume 195, Pages 446-461

Publisher

ELSEVIER

DOI: 10.1016/j.isprsjprs.2022.12.012

Keywords

Multi-view stereo; Optical satellite images; Deep learning; Dense matching; 3D reconstruction

Categories

Geography, Physical; Geosciences, Multidisciplinary; Remote Sensing; Imaging Science & Photographic Technology

Summary

This paper proposes Sat-MVSF, a general deep learning framework for three-dimensional (3D) reconstruction of the Earth's surface from multi-view optical satellite images. The framework comprises pre-processing, a multi-view stereo network designed for satellite imagery (Sat-MVSNet), and post-processing. Sat-MVSNet combines deep feature extraction, rational polynomial camera warping, pyramid cost volume construction, regularization, and regression, and a self-refinement strategy is added to improve performance and robustness. Comparative experiments against commercial software and open-source methods demonstrate the potential of the proposed framework. The authors also emphasize the need for more high-quality open-source training data to facilitate research in this field.

Abstract

In this paper, we propose a general deep learning based framework, named Sat-MVSF, to perform three-dimensional (3D) reconstruction of the Earth's surface from multi-view optical satellite images. The framework is a complete processing pipeline, including pre-processing, a multi-view stereo (MVS) network for satellite imagery (Sat-MVSNet), and post-processing. The pre-processing handles the geometric and radiometric configuration of the multi-view images and their cropping. The cropped multi-view patches are then fed into Sat-MVSNet, which includes deep feature extraction, rational polynomial camera (RPC) warping, pyramid cost volume construction, regularization, and regression, to obtain the height maps. In the post-processing, erroneous matches are filtered out and a digital surface model (DSM) is generated. Considering the complexity and diversity of real-world scenes, we also introduce a self-refinement strategy that does not require any ground-truth labels to enhance the performance and robustness of the Sat-MVSF framework. We comprehensively compare the proposed framework with popular commercial software and open-source methods to demonstrate the potential of the proposed deep learning framework. On the WHU-TLC dataset, where the images are captured with a three-line camera (TLC), the proposed framework outperforms all the other solutions in terms of reconstruction fineness, and also outperforms most of the other methods in terms of efficiency. On the challenging MVS3D dataset, where the images are captured by the WorldView-3 satellite at different times and seasons, the proposed framework also exceeds the existing methods when using the model pretrained on aerial images and the introduced self-refinement strategy, demonstrating a high generalization ability.
We also note that the lack of training samples hinders research in this field, and the availability of more high-quality open-source training data will greatly accelerate the research into deep learning based MVS satellite image reconstruction. The code will be available at https://gpcv.whu.edu.cn/data.
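
The abstract lists "regression" as the last step that turns the regularized cost volume into a height map, but does not say which form it takes. As a rough illustration only, the sketch below assumes a soft-argmin style regression, which is common in MVSNet-like networks: costs along the height axis are converted to probabilities with a softmax, and the height is the probability-weighted mean of the height hypotheses. The function name `regress_height`, the array shapes, and the sign convention (lower cost means a better match) are assumptions for this example, not details taken from the paper.

```python
import numpy as np


def regress_height(cost_volume, height_hypotheses):
    """Soft-argmin height regression (illustrative sketch, not the paper's code).

    cost_volume: array of shape (D, H, W); lower cost is assumed to mean a better match.
    height_hypotheses: array of shape (D,), one candidate height per cost slice.
    Returns the (H, W) expected height map and the (D, H, W) probability volume.
    """
    logits = -np.asarray(cost_volume, dtype=np.float64)
    # Subtract the per-pixel maximum for a numerically stable softmax.
    logits -= logits.max(axis=0, keepdims=True)
    prob = np.exp(logits)
    prob /= prob.sum(axis=0, keepdims=True)
    # Expected height per pixel: sum over the D hypotheses of h_d * p_d.
    heights = np.asarray(height_hypotheses, dtype=np.float64)
    height_map = np.tensordot(heights, prob, axes=(0, 0))
    return height_map, prob


# Hypothetical usage: four height hypotheses over a 2 x 2 patch.
hypotheses = np.linspace(0.0, 30.0, 4)
costs = np.full((4, 2, 2), 50.0)
costs[2] = 0.0  # the third hypothesis matches best everywhere
height_map, prob = regress_height(costs, hypotheses)
```

Because the output is an expectation rather than a hard argmin, the regressed height can fall between the discrete hypotheses, which is one reason this form is often preferred in learned MVS pipelines.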
