4.7 Article

Edge supervision and multi-scale cost volume for stereo matching

Journal

IMAGE AND VISION COMPUTING
Volume 117

Publisher

ELSEVIER
DOI: 10.1016/j.imavis.2021.104336

Keywords

Stereo matching; Geometric constraints; Multi-scale cost volume; Disparity refinement network

Funding

  1. Science and Technology Major Project of Guizhou Province (Qiankehe Major Projects) [ZNWLQC [2019]3012]
  2. Science and Technology Project of Guizhou Province Department of Transportation [2021-322-021]
  3. Natural Science Foundation of Guangdong Province [2020A1515110501]
  4. Science and Technology Planning Project of Shenzhen [JCYJ20180503182133411]

The research proposes RDNet, a network that incorporates edge cues into stereo matching, and generates a depth ground-truth boundary dataset by mining instance segmentation and semantic segmentation datasets. A multi-scale cost volume and a disparity refinement network are also introduced to further improve stereo matching performance.
Recently, methods based on convolutional neural networks have made great progress in stereo matching. However, it is still difficult to find accurate matching points in inherently ill-posed regions (e.g., weakly textured areas and around object edges), where the accuracy of disparity estimation can be improved by the corresponding geometric constraints. To tackle this problem, we generate a depth ground-truth boundary dataset by mining instance segmentation and semantic segmentation datasets and propose RDNet, which incorporates edge cues into stereo matching. The network learns geometric information through a separate processing branch, the edge stream, which processes feature information in parallel with the stereo stream. The edge stream removes noise and focuses only on the relevant boundary information. In addition, we introduce a multi-scale cost volume in hierarchical cost aggregation to enlarge the receptive fields and capture structural and global representations, which improves scene understanding and disparity estimation accuracy. Moreover, a disparity refinement network with several dilated convolutions is applied to further improve the accuracy of the final disparity estimation. The proposed method is evaluated on the Sceneflow, KITTI 2015 and KITTI 2012 benchmark datasets, and the qualitative and quantitative results demonstrate that the proposed RDNet achieves state-of-the-art stereo matching performance. (c) 2021 Published by Elsevier B.V.
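
To make the idea of a multi-scale cost volume concrete, the following is a minimal NumPy sketch, not RDNet's implementation. The abstract does not say how the cost volume is built, so this sketch assumes a simple correlation cost (mean of channel-wise products) and 2x2 average pooling between scales. The function names `correlation_cost_volume`, `avg_pool2` and `multi_scale_cost_volumes` are hypothetical.

```python
import numpy as np


def correlation_cost_volume(left, right, max_disp):
    """Correlate (C, H, W) left/right features over disparities 0..max_disp-1."""
    _, h, w = left.shape
    volume = np.zeros((max_disp, h, w), dtype=left.dtype)
    for d in range(min(max_disp, w)):
        # Left pixel x is compared with right pixel x - d; columns x < d
        # have no candidate and keep a cost of zero.
        volume[d, :, d:] = (left[:, :, d:] * right[:, :, : w - d]).mean(axis=0)
    return volume


def avg_pool2(feat):
    """2x2 average pooling of a (C, H, W) map; assumes even H and W."""
    c, h, w = feat.shape
    return feat.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))


def multi_scale_cost_volumes(left, right, max_disp, num_scales=3):
    """Build one cost volume per scale, from full to coarsest resolution."""
    volumes = []
    for scale in range(num_scales):
        if scale > 0:
            left, right = avg_pool2(left), avg_pool2(right)
        # Halving the resolution also halves the disparity search range.
        disp_range = max(1, max_disp >> scale)
        volumes.append(correlation_cost_volume(left, right, disp_range))
    return volumes


# Synthetic stand-in for learned features: the right view is the left view
# shifted by two pixels, i.e. a constant disparity of 2.
rng = np.random.default_rng(0)
left = rng.standard_normal((4, 8, 16))
right = np.roll(left, -2, axis=2)
volumes = multi_scale_cost_volumes(left, right, max_disp=8)
print([v.shape for v in volumes])
```

In a learned network the coarser volumes would typically come from deeper feature maps and be fused during cost aggregation; here they are only returned as a list to show how resolution and disparity range shrink together.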
