Article

Fixing Defect of Photometric Loss for Self-Supervised Monocular Depth Estimation

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TCSVT.2021.3068834

Keywords

Cameras; Geometry; Three-dimensional displays; Estimation; Optical variables control; Optical imaging; Deep learning; Photometric consistency; 3D reconstruction; epipolar geometry

Funding

  1. Natural Science Foundation of Hunan Province [2017JJ2252]
  2. Education Department of Hunan Province [16B258]
  3. National Key Research and Development Program of China [2018AAA0102102]

Abstract

View-synthesis-based methods have achieved promising results in unsupervised depth estimation. However, the ambiguity of pose and depth combinations and violations of the photometric consistency assumption remain challenges. To address these issues, this study proposes a point cloud consistency constraint to eliminate the ambiguity, threshold masks to filter out dynamic and occluded points, and matching point and epipolar constraints to improve the accuracy and robustness of depth prediction.
View-synthesis-based methods have shown very promising results for the task of unsupervised depth estimation from single images. Most existing approaches synthesize a new image and employ it as the supervision signal for depth and pose prediction. These approaches have two problems: 1) many combinations of pose and depth can synthesize a given new image, so reconstructing depth and pose by view synthesis from only two images is an inherently ill-posed problem; 2) the model is trained under the photometric consistency assumption that brightness or gradient remains constant across the frames of a video sequence, an assumption that is easily violated in realistic scenes due to lighting changes, reflective surfaces, and occlusions. To overcome the first drawback, we exploit a point cloud consistency constraint to eliminate the ambiguity. To overcome the second drawback, we use threshold masks to filter out dynamic and occluded points and introduce matching point constraints, which implicitly encode the geometric relationship between two matched points, to improve the precision of depth prediction. In addition, we employ epipolar constraints to compensate for the instability of the photometric error in textureless regions and under varying illumination. Experimental results on the KITTI, Cityscapes, and NYUv2 datasets show that the method improves the accuracy of depth prediction and enhances the robustness of the model to textureless regions and illumination changes. The code and data are available at https://github.com/XTUPRLAB/FixUnDepth.
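The loss terms summarized in the abstract can be illustrated with a minimal sketch. The code below is not the authors' released implementation but an assumed PyTorch-style illustration of a view-synthesis photometric loss with a per-pixel threshold mask (to drop likely dynamic or occluded points) and a Sampson-style epipolar residual for matched points; helper names such as backproject, project, and the threshold value 0.3 are illustrative assumptions.

    # Minimal sketch, assuming PyTorch >= 1.10 (for meshgrid indexing="ij").
    import torch
    import torch.nn.functional as F

    def backproject(depth, K_inv):
        """Lift a depth map (B,1,H,W) to camera-space 3D points (B,3,H,W)."""
        b, _, h, w = depth.shape
        ys, xs = torch.meshgrid(
            torch.arange(h, dtype=depth.dtype, device=depth.device),
            torch.arange(w, dtype=depth.dtype, device=depth.device),
            indexing="ij",
        )
        ones = torch.ones_like(xs)
        pix = torch.stack([xs, ys, ones], dim=0).view(1, 3, -1)      # (1,3,HW)
        rays = K_inv @ pix                                           # (B,3,HW)
        return (rays * depth.view(b, 1, -1)).view(b, 3, h, w)

    def project(points, K, T):
        """Rigidly transform points with pose T (B,4,4), project with K (B,3,3),
        and return a sampling grid normalized to [-1, 1] for grid_sample."""
        b, _, h, w = points.shape
        pts = points.view(b, 3, -1)
        pts = T[:, :3, :3] @ pts + T[:, :3, 3:4]
        cam = K @ pts
        pix = cam[:, :2] / cam[:, 2:3].clamp(min=1e-6)
        x = 2.0 * pix[:, 0] / (w - 1) - 1.0
        y = 2.0 * pix[:, 1] / (h - 1) - 1.0
        return torch.stack([x, y], dim=-1).view(b, h, w, 2)

    def photometric_loss_with_mask(target, source, depth, K, K_inv, T, thresh=0.3):
        """View-synthesis photometric loss; a threshold mask removes pixels whose
        error is too large (likely dynamic objects or occlusions)."""
        grid = project(backproject(depth, K_inv), K, T)
        warped = F.grid_sample(source, grid, padding_mode="border", align_corners=True)
        err = (warped - target).abs().mean(dim=1, keepdim=True)     # (B,1,H,W)
        mask = (err < thresh).float().detach()                       # threshold mask
        return (err * mask).sum() / mask.sum().clamp(min=1.0)

    def sampson_epipolar_residual(p1, p2, Fmat):
        """First-order (Sampson) epipolar error for matched homogeneous points
        p1, p2 of shape (B,N,3); adds a geometric signal where the photometric
        error is unreliable (textureless regions, lighting changes)."""
        Fp1 = torch.einsum("bij,bnj->bni", Fmat, p1)                 # F p1
        Ftp2 = torch.einsum("bji,bnj->bni", Fmat, p2)                # F^T p2
        num = torch.einsum("bni,bni->bn", p2, Fp1) ** 2              # (p2^T F p1)^2
        den = Fp1[..., 0] ** 2 + Fp1[..., 1] ** 2 + Ftp2[..., 0] ** 2 + Ftp2[..., 1] ** 2
        return (num / den.clamp(min=1e-8)).mean()

In such a setup the total objective would sum the masked photometric term over source views with weighted epipolar and point-cloud-consistency terms; the weighting and mask threshold here are placeholders, not values reported by the paper.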
