Article

Multi-Scale Spatial Attention-Guided Monocular Depth Estimation With Semantic Enhancement

Journal

IEEE Transactions on Image Processing
Volume 30, Pages 8811-8822

Publisher

IEEE (Institute of Electrical and Electronics Engineers)
DOI: 10.1109/TIP.2021.3120670

Keywords

Estimation; Semantics; Mutual information; Feature extraction; Correlation; Cameras; Visualization; Depth estimation; Multi-scale spatial attention-guided; Semantic enhancement

Funding

  1. National Natural Science Foundation of China [61771091, 61871066]
  2. National High Technology Research and Development Program (863 Program) of China [2015AA016306]
  3. Natural Science Foundation of Liaoning Province of China [20170540159]
  4. Fundamental Research Fund for the Central Universities of China [DUT17LAB04]


This study presents a monocular depth estimation method with multi-scale spatial attention guidance and semantic enhancement, which can focus more on small objects and improve the sharpness of depth prediction edges. Experimental results on public benchmark datasets demonstrate the effectiveness and superior performance of the proposed method.
Depth estimation from a single monocular image is a vital but challenging task in 3D vision and scene understanding. Previous unsupervised methods have yielded impressive results, but the predicted depth maps still suffer from several shortcomings, such as missing small objects and blurred object edges. To address these problems, a multi-scale spatial attention-guided monocular depth estimation method with semantic enhancement is proposed. Specifically, we first construct a multi-scale spatial attention-guided block based on atrous spatial pyramid pooling and spatial attention. Then, the correlation between the left and right views is fully explored via mutual information to obtain a more robust feature representation. Finally, we design a double-path prediction network that simultaneously generates depth maps and semantic labels. The proposed multi-scale spatial attention-guided block focuses more on objects, especially small ones. Moreover, the additional semantic information makes object edges in the predicted depth maps sharper. We conduct comprehensive evaluations on public benchmark datasets such as KITTI and Make3D. The experimental results demonstrate the effectiveness of the proposed method, which achieves better performance than other self-supervised methods.
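The abstract describes a block that combines atrous spatial pyramid pooling (multi-rate context) with spatial attention. The paper's actual architecture is not reproduced here; the following is only a minimal NumPy sketch of that general idea, in which a dilated neighbour average stands in for atrous convolution, a channel-pooled sigmoid map stands in for spatial attention, and the rates, function names, and residual fusion are all illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def atrous_neighbor_avg(feat, rate):
    # Crude stand-in (assumption) for a dilated 3x3 operation: average each
    # pixel with its 8 neighbours sampled `rate` pixels away, zero-padded.
    H, W, _ = feat.shape
    padded = np.pad(feat, ((rate, rate), (rate, rate), (0, 0)))
    out = np.zeros_like(feat)
    for dy in (-rate, 0, rate):
        for dx in (-rate, 0, rate):
            out += padded[rate + dy:rate + dy + H, rate + dx:rate + dx + W, :]
    return out / 9.0

def spatial_attention(feat):
    # Channel-pooled spatial attention: one gating weight per pixel in (0, 1).
    avg = feat.mean(axis=-1, keepdims=True)
    mx = feat.max(axis=-1, keepdims=True)
    return sigmoid(avg + mx)

def multiscale_attention_block(feat, rates=(1, 2, 4)):
    # Aggregate context at several dilation rates (ASPP-like), derive a
    # per-pixel gate from the fused context, and re-weight the input with a
    # residual connection so attended regions (e.g. small objects) are
    # amplified rather than suppressed.
    context = np.mean([atrous_neighbor_avg(feat, r) for r in rates], axis=0)
    gate = spatial_attention(context)   # shape (H, W, 1), values in (0, 1)
    return feat + feat * gate           # residual, attention-reweighted

feat = np.random.rand(8, 8, 16).astype(np.float32)
out = multiscale_attention_block(feat)
print(out.shape)  # (8, 8, 16)
```

Because the gate lies in (0, 1) and the fusion is residual, the block can only amplify features (by up to 2x) and never zeroes them out, which is one common way such attention blocks preserve information while emphasizing salient regions.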

