4.6 Article

Human Action Recognition Based on Skeleton Information and Multi-Feature Fusion

Journal

ELECTRONICS
Volume 12, Issue 17

Publisher

MDPI
DOI: 10.3390/electronics12173702

Keywords

motion recognition; backbone network; motion evaluation


This paper addresses key challenges in human action recognition and assessment by proposing innovative methods, including using Oct-MobileNet to improve accuracy and combining skeleton-based information and multiple feature fusion for action recognition. It also introduces a multimodal information-based assessment method to help exercisers evaluate their exercise performance.
Action assessment and feedback can help fitness practitioners get more benefit from exercise. In this paper, we address key challenges in human action recognition and assessment by proposing methods that improve performance while reducing computational complexity. First, we present Oct-MobileNet, a lightweight backbone network that replaces the VGG19 network of the traditional OpenPose algorithm, which has a large parameter count and high device requirements. Oct-MobileNet uses octave convolution and attention mechanisms to improve the extraction of high-frequency features from the human body contour, improving accuracy while reducing the model's computational cost. Second, we introduce an action recognition approach that combines skeleton-based information with multi-feature fusion: we extract spatial geometric and temporal features from actions and integrate them with a sliding window algorithm. Experimental results show that the approach accurately recognizes and classifies various human actions. Finally, we address the evaluation of traditional fitness exercises, focusing on the BaDunJin movements. We propose a multimodal information-based assessment method that combines pose detection and keypoint analysis. Label sequences are obtained from a pose detector, and each frame's keypoint coordinates are represented as a pose vector. Using this multimodal information (label sequences and pose vectors), we measure action similarity and perform quantitative evaluations that help exercisers assess the quality of their performance.
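The abstract names three ingredients without giving formulas: spatial geometric and temporal features from skeleton keypoints, a sliding window that integrates them, and pose vectors compared for action similarity. The sketch below illustrates one plausible reading of those ideas in plain Python. It is not the authors' implementation: the function names, the choice of a joint angle as the spatial feature, mean keypoint displacement as the temporal feature, centroid and scale normalization, and cosine similarity as the similarity measure are all assumptions made for illustration.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at joint b formed by 2D keypoints a-b-c (assumed spatial feature)."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0.0 or n2 == 0.0:
        return 0.0  # degenerate limb: two keypoints coincide
    cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def pose_vector(keypoints):
    """Flatten one frame's keypoints into a vector, normalised for position and scale.

    The normalisation (subtract centroid, divide by the largest radius) is an
    assumption; the abstract only says coordinates are represented as pose vectors.
    """
    cx = sum(x for x, _ in keypoints) / len(keypoints)
    cy = sum(y for _, y in keypoints) / len(keypoints)
    centered = [(x - cx, y - cy) for x, y in keypoints]
    scale = max(math.hypot(x, y) for x, y in centered) or 1.0
    return [v / scale for point in centered for v in point]


def cosine_similarity(u, v):
    """Cosine similarity between two pose vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def sliding_windows(frames, size, stride=1):
    """Split a frame sequence into overlapping windows of `size` frames."""
    return [frames[i:i + size] for i in range(0, len(frames) - size + 1, stride)]


def window_features(window, triplet):
    """Fuse one spatial and one temporal feature over a window of frames.

    `triplet` holds the indices of three keypoints (e.g. shoulder, elbow, wrist).
    Returns [mean joint angle, mean keypoint displacement between consecutive frames].
    """
    i, j, k = triplet
    angles = [joint_angle(f[i], f[j], f[k]) for f in window]
    moves = [
        math.dist(p, q)
        for prev, cur in zip(window, window[1:])
        for p, q in zip(prev, cur)
    ]
    mean_angle = sum(angles) / len(angles)
    mean_move = sum(moves) / len(moves) if moves else 0.0
    return [mean_angle, mean_move]


# Hypothetical usage: three keypoints per frame, four frames of a bending arm.
frames = [
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    [(0.0, 0.0), (1.0, 0.0), (1.9, 0.3)],
    [(0.0, 0.0), (1.0, 0.0), (1.7, 0.7)],
    [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)],
]
features = [window_features(w, (0, 1, 2)) for w in sliding_windows(frames, size=2)]
similarity = cosine_similarity(pose_vector(frames[0]), pose_vector(frames[-1]))
```

Under these assumptions, each window yields a small feature vector that a classifier could consume, while the per-frame similarity gives a number that could be compared against a reference performance. A real system would use the full keypoint set produced by the pose estimator and the feature definitions given in the paper itself.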
