Journal
IET COMPUTER VISION
Volume 16, Issue 2, Pages 143-158
Publisher
WILEY
DOI: 10.1049/cvi2.12075
Keywords
computer graphics; computer vision; convolutional neural nets; graphics processing units; space-time adaptive processing
Funding
- Key-Area Research and Development Program of Guangdong Province [2020B1111010002, B020214001, B010109001]
- Guangzhou Industrial Technology Major Research Plan [2019-0101-12-1006-0001]
This study introduces a novel multi-stream adaptive spatial-temporal attention GCN model that addresses issues found in existing GCN models, enhancing network performance through the use of a learnable topology graph and spatial-temporal attention module.
Skeleton-based action recognition algorithms have been widely applied to human action recognition. Graph convolutional networks (GCNs) generalise convolutional neural networks (CNNs) to non-Euclidean graphs and achieve significant performance in skeleton-based action recognition. However, existing GCN-based models have several limitations: the graph topology is defined by the natural skeleton of the human body and is fixed during training, so it may not suit different layers of the GCN model or diverse datasets. In addition, higher-order information in the joint data, such as skeleton and dynamic information, is not fully utilised. This work proposes a novel multi-stream adaptive spatial-temporal attention GCN model that overcomes these issues. The method designs a learnable topology graph that adaptively adjusts connection relationships and strengths, and that is updated during training along with the other network parameters. Simultaneously, adaptive connection parameters are used to combine the natural skeleton graph with the adaptive topology graph. A spatial-temporal attention module is embedded in each graph convolution layer to ensure that the network focuses on the more critical joints and frames. A multi-stream framework integrates multiple inputs, further improving the performance of the network. The final network achieves state-of-the-art performance on both the NTU-RGBD and Kinetics-Skeleton action recognition datasets. The experimental results show that the proposed method outperforms existing methods from all perspectives, demonstrating its superiority.
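The core idea described in the abstract — blending a fixed natural-skeleton adjacency with a learnable topology graph, then re-weighting joints with attention — can be illustrated with a minimal sketch. This is a simplified illustration only, not the paper's implementation: the function names, the scalar blending parameter `alpha`, and the energy-based attention are assumptions made for clarity.

```python
import numpy as np

def adaptive_graph_conv(x, A_skeleton, A_learned, alpha, W):
    """One hypothetical adaptive graph-convolution step.

    x          : (V, C) joint features (V joints, C channels per joint)
    A_skeleton : (V, V) fixed adjacency from the natural human skeleton
    A_learned  : (V, V) learnable topology, updated during training
                 along with the other network parameters
    alpha      : learnable scalar controlling the blend of the two graphs
                 (an assumption; the paper uses adaptive connection
                 parameters to combine them)
    W          : (C, C_out) feature-transform weights
    """
    A = A_skeleton + alpha * A_learned   # blended topology graph
    return A @ x @ W                     # propagate along graph, transform features

def spatial_attention(x):
    """Toy spatial attention: softmax over per-joint feature energy,
    so joints with larger activations receive more weight."""
    energy = (x ** 2).sum(axis=1)        # one scalar per joint
    w = np.exp(energy - energy.max())    # numerically stable softmax
    w = w / w.sum()
    return x * w[:, None]                # re-weight each joint's features
```

In the multi-stream framework, separate streams of this kind would process different inputs (e.g. joint, skeleton, and dynamic streams) and their scores would be fused at the end.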