Article

A general deep learning based framework for 3D reconstruction from multi-view stereo satellite images

Publisher

ELSEVIER
DOI: 10.1016/j.isprsjprs.2022.12.012

Keywords

Multi-view stereo; Optical satellite images; Deep learning; Dense matching; 3D reconstruction

In this paper, a general deep learning framework named Sat-MVSF is proposed for three-dimensional (3D) reconstruction of the Earth's surface from multi-view optical satellite images. The framework consists of pre-processing, a multi-view stereo network designed specifically for satellite imagery (Sat-MVSNet), and post-processing. It achieves state-of-the-art performance and robustness by combining deep feature extraction, rational polynomial camera warping, pyramid cost volume construction, regularization, regression, and a self-refinement strategy. Comparative experiments demonstrate the advantages of the proposed framework over commercial and open-source methods. The authors also emphasize the need for more high-quality open-source training data to facilitate research in this field.
In this paper, we propose a general deep learning based framework, named Sat-MVSF, to perform three-dimensional (3D) reconstruction of the Earth's surface from multi-view optical satellite images. The framework is a complete processing pipeline, including pre-processing, a multi-view stereo (MVS) network for satellite imagery (Sat-MVSNet), and post-processing. The pre-processing handles the geometric and radiometric configuration of the multi-view images and their cropping. The cropped multi-view patches are then fed into Sat-MVSNet, which includes deep feature extraction, rational polynomial camera (RPC) warping, pyramid cost volume construction, regularization, and regression, to obtain the height maps. In the post-processing, erroneous matches are filtered out and a digital surface model (DSM) is generated. Considering the complexity and diversity of real-world scenes, we also introduce a self-refinement strategy that does not require any ground-truth labels to enhance the performance and robustness of the Sat-MVSF framework. We comprehensively compare the proposed framework with popular commercial software and open-source methods to demonstrate the potential of the proposed deep learning framework. On the WHU-TLC dataset, where the images are captured with a three-line camera (TLC), the proposed framework outperforms all the other solutions in terms of reconstruction fineness, and also outperforms most of the other methods in terms of efficiency. On the challenging MVS3D dataset, where the images are captured by the WorldView-3 satellite at different times and seasons, the proposed framework also exceeds the existing methods when using the model pretrained on aerial images and the introduced self-refinement strategy, demonstrating a high generalization ability.
We also note that the lack of training samples hinders research in this field, and the availability of more high-quality open-source training data will greatly accelerate the research into deep learning based MVS satellite image reconstruction. The code will be available at https://gpcv.whu.edu.cn/data.
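The abstract names a regression step that turns the regularized cost volume into height maps but does not spell out how. The sketch below illustrates one common choice in MVSNet-style networks, a soft-argmin over height hypotheses; it is an assumption that Sat-MVSNet works this way, and the function name `regress_height`, the array shapes, and the "lower cost is better" convention are all hypothetical and chosen only for illustration.

```python
import numpy as np


def regress_height(cost_volume, height_hypotheses):
    """Soft-argmin height regression (illustrative, not the authors' code).

    cost_volume: array of shape (D, H, W); lower cost means a better match
        (assumed convention).
    height_hypotheses: array of shape (D,), candidate heights in metres.
    Returns an (H, W) height map.
    """
    cost_volume = np.asarray(cost_volume, dtype=np.float64)
    heights = np.asarray(height_hypotheses, dtype=np.float64)

    # Softmax of the negated cost along the hypothesis axis.
    # Subtracting the per-pixel maximum keeps the exponentials stable.
    logits = -cost_volume
    logits = logits - logits.max(axis=0, keepdims=True)
    prob = np.exp(logits)
    prob = prob / prob.sum(axis=0, keepdims=True)

    # Per-pixel expected height under the probability distribution.
    return np.tensordot(heights, prob, axes=(0, 0))
```

Because the result is an expectation rather than a hard argmin, it can fall between the sampled hypotheses, which is one reason this formulation is popular for sub-interval depth or height estimates.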
