3.8 Proceedings Paper

MVS2D: Efficient Multi-view Stereo via Attention-Driven 2D Convolutions

Publisher

IEEE Computer Society
DOI: 10.1109/CVPR52688.2022.00838

Abstract

Deep learning has made significant impacts on multi-view stereo systems. State-of-the-art approaches typically build a cost volume, followed by multiple 3D convolution operations, to recover the input image's pixel-wise depth. While such end-to-end learning of plane-sweeping stereo improves accuracy on public benchmarks, these methods are typically slow to compute. We present MVS2D, a highly efficient multi-view stereo algorithm that seamlessly integrates multi-view constraints into single-view networks via an attention mechanism. Since MVS2D builds only on 2D convolutions, it is at least 2x faster than all notable counterparts. Moreover, our algorithm produces precise depth estimations and 3D reconstructions, achieving state-of-the-art results on the challenging benchmarks ScanNet, SUN3D, and RGBD, as well as the classical DTU dataset. Our algorithm also outperforms all other algorithms in the setting of inexact camera poses. Our code is released at https://github.com/zhenpeiyang/MVS2D.
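
The abstract's key architectural claim is that per-pixel multi-view matching can be expressed as attention and injected into a plain 2D network, avoiding a 3D cost-volume head entirely. The snippet below is a minimal, hypothetical PyTorch sketch of that idea, not the authors' implementation (see the repository above for that): reference-view features act as queries, source-view features pre-sampled at D depth hypotheses along each pixel's epipolar line act as keys, and learned per-hypothesis depth codes act as values, so the output is an ordinary 2D feature map. The function name, tensor layout, and the `depth_codes` embedding are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def epipolar_attention(ref_feat, src_feat_samples, depth_codes):
    """Hypothetical sketch of attention-driven multi-view aggregation.

    ref_feat:          (B, C, H, W)    reference-view features (queries)
    src_feat_samples:  (B, D, C, H, W) source-view features sampled at D
                                       depth hypotheses along each pixel's
                                       epipolar line (keys)
    depth_codes:       (D, C)          learned per-hypothesis embeddings
                                       (values); an assumed design choice

    Returns (B, C, H, W): a 2D feature map that can be fed to an ordinary
    2D CNN, so no 3D convolutions over a cost volume are needed.
    """
    B, D, C, H, W = src_feat_samples.shape
    q = ref_feat.unsqueeze(1)                              # (B, 1, C, H, W)
    # Dot-product matching score for each depth hypothesis at each pixel.
    scores = (q * src_feat_samples).sum(dim=2) / C ** 0.5  # (B, D, H, W)
    attn = F.softmax(scores, dim=1)                        # soft depth assignment
    # Attention-weighted sum of learned depth codes collapses the depth axis.
    v = depth_codes.t().reshape(1, C, D, 1, 1)             # (1, C, D, 1, 1)
    out = (attn.unsqueeze(1) * v).sum(dim=2)               # (B, C, H, W)
    return out


# Toy usage with random tensors (shapes only; no real epipolar geometry).
B, C, H, W, D = 1, 32, 8, 8, 16
ref = torch.randn(B, C, H, W)
src = torch.randn(B, D, C, H, W)
codes = torch.randn(D, C)
print(epipolar_attention(ref, src, codes).shape)  # torch.Size([1, 32, 8, 8])
```

Because the attention collapses the depth dimension before any convolution runs, everything downstream stays 2D, which is consistent with the speedup the abstract claims over cost-volume methods.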
