Journal
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
Volume 39, Issue 4, Pages 719-731
Publisher
IEEE COMPUTER SOC
DOI: 10.1109/TPAMI.2016.2574713
Keywords
Object reconstruction; 3D shape modeling; viewpoint estimation; scene understanding
Funding
- NSF [IIS-1212798]
- ONR [MURI-N00014-10-1-0933]
- Berkeley fellowship
- Portuguese Science Foundation, FCT [SFRH/BPD/84194/2012]
- Fundação para a Ciência e a Tecnologia [SFRH/BPD/84194/2012] Funding Source: FCT
Abstract
We address the problem of fully automatic object localization and reconstruction from a single image. This problem is both challenging and important, yet until recently it received limited attention because of the difficulty of segmenting objects and predicting their poses. Here we leverage recent advances in learning convolutional networks for object detection and segmentation, and we introduce a complementary network for camera viewpoint prediction. These predictors are powerful but still imperfect given the stringent requirements of shape reconstruction. Our main contribution is a new class of deformable 3D models that can be robustly fitted to images from noisy pose and silhouette estimates computed upstream, and that can be learned directly from the 2D annotations available in object detection datasets. Our models capture top-down information about the main global modes of shape variation within a class, providing a low-frequency shape. To capture fine instance-specific shape details, we fuse this low-frequency shape with a high-frequency component recovered from shading cues. A comprehensive quantitative analysis and ablation study on the PASCAL 3D+ dataset validates the approach, and we show fully automatic reconstructions on PASCAL VOC as well as large improvements on the task of viewpoint prediction.
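The abstract does not say how the low-frequency and high-frequency components are fused, so the following is only a generic illustration of a frequency-split fusion of two depth maps, not the paper's procedure. It assumes both components are available as depth maps on the same pixel grid; the names `low_pass`, `fuse_depth`, and the parameter `sigma` are hypothetical and chosen for this sketch.

```python
import numpy as np


def gaussian_kernel(sigma: float) -> np.ndarray:
    """Return a normalized 1D Gaussian kernel (assumed cutoff: 3 sigma)."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=float)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()


def low_pass(depth: np.ndarray, sigma: float) -> np.ndarray:
    """Separable Gaussian blur with edge padding; output keeps the input shape."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(depth, radius, mode="edge")
    # Filter along rows, then along columns.
    rows = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 0, rows
    )


def fuse_depth(coarse: np.ndarray, detailed: np.ndarray, sigma: float = 2.0) -> np.ndarray:
    """Combine the low frequencies of `coarse` with the high frequencies of `detailed`.

    Hypothetical fusion rule: keep the smooth part of the class-level shape
    and add the residual detail of the shading-based estimate.
    """
    high_freq = detailed - low_pass(detailed, sigma)
    return low_pass(coarse, sigma) + high_freq


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    coarse = np.full((32, 32), 5.0)                          # stand-in for a model-based depth map
    detailed = coarse + 0.05 * rng.standard_normal((32, 32))  # stand-in for a shading-based estimate
    fused = fuse_depth(coarse, detailed)
    print(fused.shape)
```

Under this rule, the overall shape comes from the coarse input while only the fine residual of the detailed input is retained; `sigma` sets where the split between the two bands falls.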