☆ 4.6 Article

Multi-View Stereo Network Based on Attention Mechanism and Neural Volume Rendering

ELECTRONICS (2023)

期刊

ELECTRONICS

卷 12, 期 22, 页码 -

出版社

MDPI

DOI: 10.3390/electronics12224603

关键词

multi-view stereo; 3D reconstruction; attention mechanism; neural volume rendering

类别

Computer Science, Information Systems Engineering, Electrical & Electronic Physics, Applied

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

智能总结 New
摘要

This paper introduces a learning-based multi-view stereo (MVS) algorithm based on attention mechanism and neural volume rendering. The algorithm addresses the issue of incorrect feature matching and incomplete scene reconstruction by employing multi-scale feature extraction and neural volume rendering. Experimental results demonstrate superior performance and generalization capability of the proposed algorithm.

Due to the presence of regions with weak textures or non-Lambertian surfaces, feature matching in learning-based Multi-View Stereo (MVS) algorithms often leads to incorrect matches, resulting in the construction of the flawed cost volume and incomplete scene reconstruction. In response to this limitation, this paper introduces the MVS network based on attention mechanism and neural volume rendering. Firstly, we employ a multi-scale feature extraction module based on dilated convolution and attention mechanism. This module enables the network to accurately model inter-pixel dependencies, focusing on crucial information for robust feature matching. Secondly, to mitigate the impact of the flawed cost volume, we establish a neural volume rendering network based on multi-view semantic features and neural encoding volume. By introducing the rendering reference view loss, we infer 3D geometric scenes, enabling the network to learn scene geometry information beyond the cost volume representation. Additionally, we apply the depth consistency loss to maintain geometric consistency across networks. The experimental results indicate that on the DTU dataset, compared to the CasMVSNet method, the completeness of reconstructions improved by 23.1%, and the Overall increased by 7.3%. On the intermediate subset of the Tanks and Temples dataset, the average F-score for reconstructions is 58.00, which outperforms other networks, demonstrating superior reconstruction performance and strong generalization capability.

Multi-View Stereo Network Based on Attention Mechanism and Neural Volume Rendering

期刊

ELECTRONICS

出版社

MDPI

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

Multi-View Stereo Network Based on Attention Mechanism and Neural Volume Rendering

期刊

ELECTRONICS

出版社

MDPI

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文