Article

Monocular Depth Estimation With Multi-Scale Feature Fusion

Journal

IEEE Signal Processing Letters
Volume 28, Pages 678-682

Publisher

IEEE (Institute of Electrical and Electronics Engineers) Inc.
DOI: 10.1109/LSP.2021.3067498

Keywords

Convolution; Feature extraction; Estimation; Training; Fuses; Task analysis; Kernel; Attention; dense atrous spatial pyramid pooling; depth estimation; multi-scale feature fusion

Funding

  1. National Natural Science Foundation of China [61771091, 61871066]
  2. National High Technology Research and Development Program (863 Program) of China [2015AA016306]
  3. Natural Science Foundation of Liaoning Province of China [20170540159]
  4. Fundamental Research Fund for the Central Universities of China [DUT17LAB04]

Abstract

In this letter, a monocular depth estimation method based on multi-scale feature fusion is proposed; it outperforms existing methods and achieves state-of-the-art results on several public benchmark datasets.
Depth estimation from a single image is a crucial but challenging task for reconstructing 3D structures and inferring scene geometry. However, most existing methods fail to extract fine detail and to estimate distant, small-scale objects well. In this paper, we propose a monocular depth estimation method based on multi-scale feature fusion. Specifically, to obtain input features at different scales, we first feed input images of different scales to pre-trained residual networks with shared weights. An attention mechanism then learns the salient features at each scale, integrating the detailed information in large-scale feature maps with the scene-level information in small-scale feature maps. Furthermore, inspired by dense atrous spatial pyramid pooling in semantic segmentation, we build a multi-scale feature fusion dense pyramid to further strengthen feature extraction. Finally, a scale-invariant error loss is used to predict depth maps in log space. We evaluate our method on several public benchmark datasets, including NYU Depth V2 and KITTI. The experimental results show that the proposed method outperforms existing methods and achieves state-of-the-art results.
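The letter itself ships no code, so the following sketches are illustrative only. First, a scale-invariant error loss in log space, in the form popularized by Eigen et al. for depth prediction, is straightforward to express in PyTorch; the function name, the lam=0.5 weighting, and the eps guard are assumptions, not values taken from the paper.

```python
import torch

def scale_invariant_log_loss(pred, target, lam=0.5, eps=1e-6):
    """Scale-invariant error in log space (hypothetical sketch).

    pred, target: positive depth maps of shape (B, 1, H, W).
    lam: weight of the scale-invariance term (0.5 is an assumed default).
    """
    # Per-pixel log-depth difference; eps guards against log(0).
    d = torch.log(pred + eps) - torch.log(target + eps)
    d = d.flatten(start_dim=1)  # (B, H*W)
    # (1/n) * sum(d^2) - lam * ((1/n) * sum(d))^2, averaged over the batch.
    return ((d ** 2).mean(dim=1) - lam * d.mean(dim=1) ** 2).mean()
```

Second, the dense atrous spatial pyramid pooling that the method adapts from semantic segmentation chains atrous convolutions with growing dilation rates, feeding each branch the concatenation of the block input and all earlier branch outputs. A minimal sketch, assuming hypothetical channel widths and dilation rates rather than the paper's actual configuration:

```python
import torch
import torch.nn as nn

class DenseASPP(nn.Module):
    """Densely connected atrous convolution pyramid (illustrative sketch)."""

    def __init__(self, in_ch, branch_ch=64, dilations=(3, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList()
        ch = in_ch
        for d in dilations:
            # 3x3 atrous conv; padding = dilation keeps the spatial size.
            self.branches.append(nn.Sequential(
                nn.Conv2d(ch, branch_ch, 3, padding=d, dilation=d),
                nn.BatchNorm2d(branch_ch),
                nn.ReLU(inplace=True),
            ))
            ch += branch_ch  # dense connectivity widens the next branch input
        self.project = nn.Conv2d(ch, in_ch, 1)  # fuse back to in_ch channels

    def forward(self, x):
        feats = [x]
        for branch in self.branches:
            feats.append(branch(torch.cat(feats, dim=1)))
        return self.project(torch.cat(feats, dim=1))
```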
