Article

Monocular Depth Estimation With Multi-Scale Feature Fusion

Journal

IEEE Signal Processing Letters
Volume 28, Pages 678-682

Publisher

IEEE (Institute of Electrical and Electronics Engineers) Inc.
DOI: 10.1109/LSP.2021.3067498

Keywords

Convolution; Feature extraction; Estimation; Training; Fuses; Task analysis; Kernel; Attention; dense atrous spatial pyramid pooling; depth estimation; multi-scale feature fusion

Funding

  1. National Natural Science Foundation of China [61771091, 61871066]
  2. National High Technology Research and Development Program (863 Program) of China [2015AA016306]
  3. Natural Science Foundation of Liaoning Province of China [20170540159]
  4. Fundamental Research Fund for the Central Universities of China [DUT17LAB04]

Abstract

In this letter, a monocular depth estimation method based on multi-scale feature fusion is proposed; it outperforms existing methods and achieves state-of-the-art results on several public benchmark datasets.
Depth estimation from a single image is a crucial but challenging task for reconstructing 3D structures and inferring scene geometry. However, most existing methods fail to extract fine detail and to estimate distant, small-scale objects well. In this paper, we propose a monocular depth estimation method based on multi-scale feature fusion. Specifically, to obtain input features at different scales, we first feed input images of different scales to pre-trained residual networks with shared weights. An attention mechanism then learns the salient features at each scale, integrating the detailed information in large-scale feature maps with the scene-level information in small-scale feature maps. Furthermore, inspired by dense atrous spatial pyramid pooling in semantic segmentation, we build a multi-scale feature fusion dense pyramid to further strengthen feature extraction. Finally, a scale-invariant error loss is used to predict depth maps in log space. We evaluate our method on several public benchmark datasets, including NYU Depth V2 and KITTI. The experimental results show that the proposed method outperforms existing methods and achieves state-of-the-art results.
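The letter itself ships no code, so the following sketches are illustrative only. First, a scale-invariant error loss in log space, in the form popularized by Eigen et al. for depth prediction, is straightforward to express in PyTorch; the function name, the lam=0.5 weighting, and the eps guard are assumptions, not values taken from the paper.

```python
import torch

def scale_invariant_log_loss(pred, target, lam=0.5, eps=1e-6):
    """Scale-invariant error in log space (hypothetical sketch).

    pred, target: positive depth maps of shape (B, 1, H, W).
    lam: weight of the scale-invariance term (0.5 is an assumed default).
    """
    # Per-pixel log-depth difference; eps guards against log(0).
    d = torch.log(pred + eps) - torch.log(target + eps)
    d = d.flatten(start_dim=1)  # (B, H*W)
    # (1/n) * sum(d^2) - lam * ((1/n) * sum(d))^2, averaged over the batch.
    return ((d ** 2).mean(dim=1) - lam * d.mean(dim=1) ** 2).mean()
```

Second, the dense atrous spatial pyramid pooling that the method adapts from semantic segmentation chains atrous convolutions with growing dilation rates, feeding each branch the concatenation of the block input and all earlier branch outputs. A minimal sketch, assuming hypothetical channel widths and dilation rates rather than the paper's actual configuration:

```python
import torch
import torch.nn as nn

class DenseASPP(nn.Module):
    """Densely connected atrous convolution pyramid (illustrative sketch)."""

    def __init__(self, in_ch, branch_ch=64, dilations=(3, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList()
        ch = in_ch
        for d in dilations:
            # 3x3 atrous conv; padding = dilation keeps the spatial size.
            self.branches.append(nn.Sequential(
                nn.Conv2d(ch, branch_ch, 3, padding=d, dilation=d),
                nn.BatchNorm2d(branch_ch),
                nn.ReLU(inplace=True),
            ))
            ch += branch_ch  # dense connectivity widens the next branch input
        self.project = nn.Conv2d(ch, in_ch, 1)  # fuse back to in_ch channels

    def forward(self, x):
        feats = [x]
        for branch in self.branches:
            feats.append(branch(torch.cat(feats, dim=1)))
        return self.project(torch.cat(feats, dim=1))
```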
