Journal
INFORMATION FUSION
Volume 76, Pages 46-54
Publisher
ELSEVIER
DOI: 10.1016/j.inffus.2021.05.002
Keywords
Convolutional Graph Neural Network; Multi-modal fusion; Multi-Neighbourhood Graph Neural Network; Indoor scene classification; RGB-D
Funding
- Secretary of Universities and Research of the Generalitat de Catalunya
- European Social Fund [FI2018, TEC2016-75976-R]
- Ministerio de Economia, Industria y Competitividad
- European Regional Development Fund (ERDF)
This paper presents a 2D-3D Fusion stage that combines 3D Geometric Features with 2D Texture Features to obtain a more robust geometric embedding. In experiments on the NYU-Depth-V2 and SUN RGB-D datasets, the method outperforms the current state of the art in RGB-D indoor scene classification.
Multi-modal fusion has been shown to improve performance in scene classification tasks. This paper presents a 2D-3D Fusion stage that combines 3D Geometric Features with 2D Texture Features obtained by 2D Convolutional Neural Networks. To obtain a robust 3D Geometric embedding, a network that uses two novel layers is proposed. The first layer, Multi-Neighbourhood Graph Convolution, aims to learn a more robust geometric descriptor of the scene by combining two different neighbourhoods: one in the Euclidean space and the other in the Feature space. The second proposed layer, Nearest Voxel Pooling, improves on the performance of the well-known Voxel Pooling. Experimental results on the NYU-Depth-V2 and SUN RGB-D datasets show that the proposed method outperforms the current state of the art in RGB-D indoor scene classification.
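The idea of combining a Euclidean neighbourhood with a Feature-space neighbourhood can be illustrated with a minimal NumPy sketch. This is not the paper's layer: the abstract does not specify the aggregation or the fusion rule, so the mean aggregation, the two weight matrices `w_euc` and `w_feat`, the ReLU, and the function names below are all assumptions chosen for illustration.

```python
import numpy as np


def knn_indices(x, k):
    """Return, for each row of x, the indices of its k nearest rows (self excluded)."""
    x = np.asarray(x, dtype=float)
    # Pairwise squared Euclidean distances, shape (n, n).
    d2 = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbour
    return np.argsort(d2, axis=1)[:, :k]


def multi_neighbourhood_conv(xyz, feats, w_euc, w_feat, k=2):
    """Hypothetical two-neighbourhood graph convolution (illustrative only).

    xyz:    (n, 3) point coordinates, used for the Euclidean neighbourhood
    feats:  (n, c) per-point features, used for the Feature-space neighbourhood
    w_euc, w_feat: (c, c_out) weights, one per neighbourhood (assumed design)
    """
    feats = np.asarray(feats, dtype=float)
    idx_euc = knn_indices(xyz, k)     # neighbours that are close in 3D space
    idx_feat = knn_indices(feats, k)  # neighbours that are close in feature space
    # Mean-aggregate neighbour features within each neighbourhood (assumption).
    agg_euc = feats[idx_euc].mean(axis=1)    # (n, c)
    agg_feat = feats[idx_feat].mean(axis=1)  # (n, c)
    # Fuse the two aggregated descriptors with a linear map and a ReLU (assumption).
    out = agg_euc @ w_euc + agg_feat @ w_feat
    return np.maximum(out, 0.0)
```

In this sketch the two neighbourhoods can differ for the same point: two points that are far apart in the room may still be feature-space neighbours if they look alike, which is the intuition the abstract gives for using both.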