Journal
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING
Volume 61, Issue -, Pages -
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TGRS.2023.3234694
Keywords
Deep learning; depth map-based stereo reconstruction; digital surface model (DSM); multiview stereo (MVS) reconstruction
This article introduces a new benchmark dataset, LuoJia-MVS, and a new deep neural network, HDC-MVSNet. The LuoJia-MVS dataset contains 7972 five-view images with a spatial resolution of 10 cm, pixel-wise depths, and precise camera parameters. HDC-MVSNet combines a new full-scale feature pyramid extraction module, a hierarchical set of 3-D convolutional blocks, and true 3-D deformable convolutional layers, all designed to address the deformation and scale variation of aerial images.
Multiview stereo (MVS) aerial image depth estimation is a research frontier in the remote sensing field. Recent deep learning-based advances in close-range object reconstruction have suggested the great potential of this approach. However, aerial images exhibit deformation and scale variation, and these characteristics limit the applicability of current methods to aerial image depth estimation. Moreover, few benchmark datasets are available for this task. In this regard, this article describes a new benchmark dataset called the LuoJia-MVS dataset (https://irsip.whu.edu.cn/resources/resources_en_v2.php), as well as a new deep neural network known as the hierarchical deformable cascade MVS network (HDC-MVSNet). The LuoJia-MVS dataset contains 7972 five-view images with a spatial resolution of 10 cm, pixel-wise depths, and precise camera parameters, and was generated from an accurate digital surface model (DSM) built from thousands of stereo aerial images. In HDC-MVSNet, a new full-scale feature pyramid extraction module, a hierarchical set of 3-D convolutional blocks, and true 3-D deformable convolutional layers are specifically designed to account for the aforementioned characteristics of aerial images. Overall and ablation experiments on the WHU and LuoJia-MVS datasets validated the superiority of HDC-MVSNet over the current state-of-the-art MVS depth estimation methods and confirmed that the newly built dataset can provide an effective benchmark.