Human Action Recognition Based on Skeleton Information and Multi-Feature Fusion

Journal

ELECTRONICS
Volume 12, Issue 17

Publisher

MDPI
DOI: 10.3390/electronics12173702

Keywords

motion recognition; backbone network; motion evaluation

Abstract

Action assessment and feedback can help fitness practitioners get more benefit from exercise. In this paper, we address key challenges in human action recognition and assessment with methods that improve performance while reducing computational complexity. First, we present Oct-MobileNet, a lightweight backbone network designed to overcome the limitations of the VGG19 network used by the traditional OpenPose algorithm, which has a large parameter count and high hardware requirements. Oct-MobileNet uses octave convolution and attention mechanisms to improve the extraction of high-frequency features from the human body contour, yielding higher accuracy at a lower computational cost. Second, we introduce an action recognition approach that combines skeleton-based information with multi-feature fusion: we extract spatial geometric and temporal features from actions and integrate them with a sliding window algorithm. Experimental results show that the approach accurately recognizes and classifies a variety of human actions. Finally, we address the evaluation of traditional fitness exercises, focusing on the BaDunJin movements. We propose a multimodal information-based assessment method that combines pose detection and keypoint analysis. A pose detector produces label sequences, and each frame's keypoint coordinates are represented as a pose vector. Using this multimodal information (label sequences and pose vectors), we measure action similarity and perform quantitative evaluations that help exercisers assess the quality of their exercise performance.
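The abstract does not specify the exact features, window size, or similarity measure, so the following is only a minimal sketch of the two ideas it mentions: fusing spatial and temporal skeleton features over a sliding window, and comparing two pose vectors. The function names `window_features` and `pose_similarity`, the choice of centroid-centred coordinates and mean frame-to-frame displacement as features, and the use of cosine similarity are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def window_features(seq, window=5, stride=1):
    """Fuse simple spatial and temporal features over sliding windows.

    seq: array of shape (T, J, C) with T frames, J keypoints, C coordinates.
    Returns an array of shape (num_windows, 2 * J * C).
    The feature definitions here are illustrative assumptions.
    """
    seq = np.asarray(seq, dtype=float)
    n_frames, n_joints, n_coords = seq.shape
    feats = []
    for start in range(0, n_frames - window + 1, stride):
        w = seq[start:start + window]
        # Spatial part: keypoints relative to the per-frame centroid,
        # averaged over the window (removes global translation).
        centered = w - w.mean(axis=1, keepdims=True)
        spatial = centered.mean(axis=0).ravel()
        # Temporal part: mean absolute frame-to-frame displacement per keypoint.
        temporal = np.abs(np.diff(w, axis=0)).mean(axis=0).ravel()
        # Fusion by concatenation.
        feats.append(np.concatenate([spatial, temporal]))
    if not feats:
        # Sequence shorter than one window: no features.
        return np.empty((0, 2 * n_joints * n_coords))
    return np.stack(feats)


def pose_similarity(p, q):
    """Cosine similarity between two pose vectors of shape (J, C).

    Each pose is centred on its centroid and scaled to unit norm first,
    so the score ignores translation and uniform scaling.
    """
    def normalize(x):
        x = np.asarray(x, dtype=float)
        x = (x - x.mean(axis=0, keepdims=True)).ravel()
        norm = np.linalg.norm(x)
        return x / norm if norm > 0 else x

    return float(np.dot(normalize(p), normalize(q)))
```

Under these assumptions, a 10-frame sequence of 18 two-dimensional keypoints with `window=5` and `stride=1` yields 6 fused feature vectors of length 72, and `pose_similarity` returns a value in [-1, 1] that could serve as one input to a quantitative score alongside the label sequences.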
