Journal: IEEE TRANSACTIONS ON IMAGE PROCESSING
Volume 30, Pages 4691-4705
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TIP.2021.3074306
Keywords
Estimation; Training; Feature extraction; Cameras; Benchmark testing; Sensors; Image sensors; Depth estimation; self-supervised; dual-attention; feature fusion
Funding
- Baidu Research
- National Key Research and Development Program of China [2018AAA0102803]
- Natural Science Foundation of China [61871325, 61671387]
This paper proposes MLDA-Net, a novel framework for self-supervised depth estimation that improves the quality of predicted depth maps through multi-level feature extraction and a dual-attention strategy.
The success of supervised learning-based single-image depth estimation methods critically depends on the availability of large-scale dense per-pixel depth annotations, which require a laborious and expensive annotation process. Self-supervised methods are therefore highly desirable and have attracted significant attention recently. However, depth maps predicted by existing self-supervised methods tend to be blurry, with many depth details lost. To overcome these limitations, we propose a novel framework, named MLDA-Net, to obtain per-pixel depth maps with sharper boundaries and richer depth details. Our first innovation is a multi-level feature extraction (MLFE) strategy that learns rich hierarchical representations. Then, a dual-attention strategy, combining global attention and structure attention, is proposed to intensify the obtained features both globally and locally, resulting in improved depth maps with sharper boundaries. Finally, a reweighted loss strategy based on multi-level outputs is proposed to provide effective supervision for self-supervised depth estimation. Experimental results demonstrate that our MLDA-Net framework achieves state-of-the-art depth prediction results on the KITTI benchmark for self-supervised monocular depth estimation across different input and training modes. Extensive experiments on other benchmark datasets further confirm the superiority of our proposed approach.
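The abstract's "global attention" refers to reweighting feature channels using a globally pooled descriptor. The paper's exact module is not reproduced here; the following is a minimal NumPy sketch of a common squeeze-and-excitation-style global attention pattern, with hypothetical weight matrices `w1` and `w2`, intended only to illustrate the general idea of intensifying features globally.

```python
import numpy as np

def global_attention(features, w1, w2):
    """Illustrative global-attention reweighting (squeeze-excitation style).

    NOTE: this is a generic sketch, not MLDA-Net's actual module.
    features: (C, H, W) feature map
    w1: (C//r, C) and w2: (C, C//r) are hypothetical learned projections
    (r is a channel-reduction ratio).
    """
    # Squeeze: global average pooling summarizes each channel into a scalar.
    pooled = features.mean(axis=(1, 2))            # shape (C,)
    # Excite: bottleneck projection with ReLU, then sigmoid gating.
    hidden = np.maximum(w1 @ pooled, 0.0)          # shape (C//r,)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # shape (C,), values in (0, 1)
    # Reweight: channels judged globally important are scaled up.
    return features * gates[:, None, None]

# Toy usage: 8 channels, a 4x4 spatial map, reduction ratio r = 2.
rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 4, 4))
out = global_attention(feats, rng.standard_normal((4, 8)),
                       rng.standard_normal((8, 4)))
print(out.shape)  # (8, 4, 4)
```

The structure-attention branch described in the abstract would complement this by modeling local spatial relationships rather than per-channel global statistics.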