Journal
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
Volume 28, Issue 11, Pages 3174-3182Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TCSVT.2017.2740321
Keywords
Classification; deep residual networks; depth estimation
Categories
Ask authors/readers for more resources
Depth estimation from single monocular images is a key component in scene understanding. Most existing algorithms formulate depth estimation as a regression problem due to the continuous property of depths. However, the depth value of input data can hardly be regressed exactly to the ground-truth value. In this paper, we propose to formulate depth estimation as a pixelwise classification task. Specifically, we first discretize the continuous ground-truth depths into several bins and label the bins according to their depth ranges. Then, we solve the depth estimation problem as classification by training a fully convolutional deep residual network. Compared with estimating the exact depth of a single point, it is easier to estimate its depth range. More importantly, by performing depth classification instead of regression, we can easily obtain the confidence of a depth prediction in the form of probability distribution. With this confidence, we can apply an information gain loss to make use of the predictions that are close to ground-truth during training, as well as fully-connected conditional random fields for post-processing to further improve the performance. We test our proposed method on both indoor and outdoor benchmark RGB-Depth datasets and achieve state-of-the-art performance.
Authors
I am an author on this paper
Click your name to claim this paper and add it to your profile.
Reviews
Recommended
No Data Available