Article

MODE: Monocular omnidirectional depth estimation via consistent depth fusion

Journal

IMAGE AND VISION COMPUTING
Volume 136, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.imavis.2023.104723

Keywords

Omnidirectional depth estimation; Depth initialization; Long-range dependency


Monocular depth estimation has seen significant progress in recent years, especially in outdoor scenes. However, depth estimation results are not satisfying in omnidirectional images. Compared to perspective images, estimating the depth map from an omnidirectional image captured in an outdoor scene with neural networks poses two additional challenges: (i) the depth range of outdoor images varies greatly across scenes, making it difficult for a depth network trained on an indoor dataset to predict accurate depths; moreover, the maximum distance in outdoor scenes mostly stays the same since the camera sees the sky, yet depth labels in this region are entirely missing from existing datasets; (ii) the standard representation of omnidirectional images introduces spherical distortion, which makes it hard for a vanilla network to predict accurate relative structural depth details. In this paper, we propose a novel network, MODE, that gives special consideration to these challenges through a set of flexible modules designed to improve the performance of omnidirectional depth estimation. First, a consistent depth structure module is proposed to estimate a consistent depth structure map, and the predicted structural map can improve depth details. Second, to suit the characteristics of spherical sampling, we propose a strip convolution fusion module to enhance long-range dependencies. Third, rather than using a single depth decoder branch as in previous methods, we propose a semantics decoder branch to estimate sky regions in the omnidirectional image. The proposed method is validated on three widely used datasets, demonstrating state-of-the-art performance. Moreover, the effectiveness of each module is shown through an ablation study on real-world datasets. Our code is available at https://github.com/lkku1/MODE. © 2023 Elsevier B.V. All rights reserved.
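To give a concrete intuition for the strip convolution fusion idea mentioned in the abstract, the sketch below applies a horizontal (1×k) and a vertical (k×1) averaging window to a 2D feature map and fuses the two responses. This is a minimal pure-Python illustration of how strip-shaped kernels spread information along an entire row or column (the long-range dependency); the kernel length, averaging kernel, and fusion-by-mean choices are assumptions for illustration, not the paper's actual implementation (which would use learned convolutions, e.g. in PyTorch).

```python
def strip_conv(grid, k, axis):
    """Average over a 1xk (axis=1, horizontal) or kx1 (axis=0, vertical)
    window with zero padding, mixing values along one full direction."""
    h, w = len(grid), len(grid[0])
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for d in range(-pad, pad + 1):
                if axis == 1 and 0 <= j + d < w:
                    total += grid[i][j + d]
                elif axis == 0 and 0 <= i + d < h:
                    total += grid[i + d][j]
            out[i][j] = total / k  # zero padding: divide by full kernel length
    return out

def strip_fusion(grid, k=5):
    """Fuse horizontal and vertical strip responses by simple averaging."""
    horiz = strip_conv(grid, k, axis=1)
    vert = strip_conv(grid, k, axis=0)
    return [[(horiz[i][j] + vert[i][j]) / 2 for j in range(len(grid[0]))]
            for i in range(len(grid))]

# A 5x5 map with a single activation at the center: after fusion, the
# response is spread along the center row and column, unlike a small
# square kernel which would only touch the immediate neighborhood.
feature = [[float(i == 2 and j == 2) for j in range(5)] for i in range(5)]
fused = strip_fusion(feature, k=5)
```

In a real network the strip kernels would be learned (e.g. `nn.Conv2d` with kernel sizes `(1, k)` and `(k, 1)`) and the fusion done by a 1×1 convolution, but the directional mixing pattern is the same.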

