4.8 Article

View-Aware Geometry-Structure Joint Learning for Single-View 3D Shape Reconstruction

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TPAMI.2021.3090917

Keywords

Three-dimensional displays; Shape; Image reconstruction; Geometry; Periodic structures; Solid modeling; Topology; Single-view 3D reconstruction; structure-aware reconstruction; multimodal learning; representation learning

Funding

  1. National Key R&D Program of China [2018YFB1703404]
  2. National Natural Science Funds of China [U1701262, U1801263]

Abstract

Reconstructing a 3D shape from a single-view image using deep learning has become increasingly popular. Most existing methods focus only on reconstructing the 3D shape geometry based on image constraints. The lack of explicit modeling of structural relations among shape parts yields low-quality reconstruction results for structure-rich man-made shapes. In addition, the conventional 2D-3D joint embedding architecture for image-based 3D shape reconstruction often omits the specific view information from the given image, which may lead to degraded geometry and structure reconstruction. We address these problems by introducing VGSNet, an encoder-decoder architecture for view-aware joint geometry and structure learning. The key idea is to jointly learn a multimodal feature representation of the 2D image, 3D shape geometry, and structure so that both geometry and structure details can be reconstructed from a single-view image. To this end, we explicitly represent 3D shape structures as part relations and employ image supervision to guide the geometry and structure reconstruction. Trained with pairs of view-aligned images and 3D shapes, VGSNet implicitly encodes the view-aware shape information in the latent feature space. Qualitative and quantitative comparisons with state-of-the-art baseline methods, as well as ablation studies, demonstrate the effectiveness of VGSNet for structure-aware single-view 3D shape reconstruction.
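The joint 2D-3D embedding idea can be illustrated with a minimal sketch. All layer sizes, the linear encoders/decoders, and the random weights below are illustrative assumptions standing in for a trained network; this is not the paper's actual VGSNet architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    # Random weights stand in for trained parameters (illustrative only).
    return rng.standard_normal((in_dim, out_dim)) / np.sqrt(in_dim)

# One encoder per modality: 2D image, 3D geometry, 3D part structure.
W_img, W_geo, W_str = linear(256, 64), linear(128, 64), linear(32, 64)

def encode_joint(img_feat, geo_feat, str_feat):
    # Fuse the three modalities into a single shared latent code.
    z = np.concatenate([img_feat @ W_img, geo_feat @ W_geo, str_feat @ W_str])
    return np.tanh(z)  # 192-d joint latent

# Decoders map the shared latent back to geometry and part structure.
W_dec_geo, W_dec_str = linear(192, 128), linear(192, 32)

def decode(z):
    return z @ W_dec_geo, z @ W_dec_str

# At test time only the image is available; because the latent was learned
# from view-aligned (image, shape) pairs, both geometry AND structure can
# still be decoded from the image branch alone.
img = rng.standard_normal(256)
z = encode_joint(img, np.zeros(128), np.zeros(32))
geo, structure = decode(z)
print(geo.shape, structure.shape)  # (128,) (32,)
```

The design point this sketch mirrors is that geometry and structure share one latent space, so structural part relations constrain the geometry decoded from a single view rather than being reconstructed independently.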

