Article

Representation Learning of Temporal Dynamics for Skeleton-Based Action Recognition

Journal

IEEE TRANSACTIONS ON IMAGE PROCESSING
Volume 25, Issue 7, Pages 3010-3022

Publisher

IEEE - Institute of Electrical and Electronics Engineers, Inc.
DOI: 10.1109/TIP.2016.2552404

Keywords

Action recognition; hierarchical recurrent neural network; random scale & rotation transformations; skeleton

Funding

  1. National Basic Research Program of China [2012CB316300]
  2. National Science Foundation [1314484]
  3. Strategic Priority Research Program of the Chinese Academy of Sciences [XDB02070100]
  4. National Natural Science Foundation of China [61525306, 61420106015]


Motion characteristics of human actions can be represented by the position variation of skeleton joints. Traditional approaches generally extract the spatial-temporal representation of skeleton sequences with well-designed hand-crafted features. In this paper, in order to recognize actions according to the relative motion between the limbs and the trunk, we propose an end-to-end hierarchical RNN for skeleton-based action recognition. We divide the human skeleton into five main parts according to the human physical structure, and then feed them to five independent subnets for local feature extraction. After hierarchical feature fusion and extraction from local to global, the dimension of the final temporal-dynamics representation is reduced to the number of action categories in the corresponding data set through a single-layer perceptron. The output of the perceptron is then temporally accumulated as the input of a softmax layer for classification. Random scale and rotation transformations are employed during training to improve robustness. We compare against five other deep RNN variants derived from our model to verify the effectiveness of the proposed network, and against several other methods on motion-capture and Kinect data sets. Furthermore, we evaluate the robustness of our model trained with random scale and rotation transformations on a multiview problem. Experimental results demonstrate that our model achieves state-of-the-art performance with high computational efficiency.
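The pipeline described in the abstract (five part-wise subnets, fusion into a global recurrent layer, a single-layer perceptron, temporal accumulation, and a softmax) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the per-part dimensions, hidden size, single fusion stage, and vanilla tanh RNN cells are all simplifying assumptions made for illustration (the paper's network is hierarchical over multiple fusion levels and is trained end-to-end).

```python
import numpy as np

rng = np.random.default_rng(0)

T = 30            # frames in a skeleton sequence
H = 16            # hidden size per subnet (assumed)
NUM_CLASSES = 10  # action categories (assumed)
# Assumed joint split: input dimension (num_joints * 3) per body part.
PART_DIMS = {"trunk": 9, "left_arm": 12, "right_arm": 12,
             "left_leg": 12, "right_leg": 12}

def rnn(x, Wx, Wh):
    """Vanilla tanh RNN; returns the full hidden sequence of shape (T, H)."""
    h = np.zeros(Wh.shape[0])
    out = []
    for t in range(x.shape[0]):
        h = np.tanh(x[t] @ Wx + h @ Wh)
        out.append(h)
    return np.stack(out)

# Five independent subnets, one per body part (local feature extraction).
params = {p: (rng.standard_normal((d, H)) * 0.1,
              rng.standard_normal((H, H)) * 0.1)
          for p, d in PART_DIMS.items()}

# A random skeleton sequence standing in for real data.
seq = {p: rng.standard_normal((T, d)) for p, d in PART_DIMS.items()}
local = [rnn(seq[p], *params[p]) for p in PART_DIMS]   # five (T, H) outputs

# Fusion from local to global: concatenate part features per frame and
# run a global recurrent layer over the fused sequence.
fused = np.concatenate(local, axis=1)                  # (T, 5*H)
Wx_g = rng.standard_normal((5 * H, H)) * 0.1
Wh_g = rng.standard_normal((H, H)) * 0.1
global_feat = rnn(fused, Wx_g, Wh_g)                   # (T, H)

# Single-layer perceptron maps each frame to class scores; scores are
# temporally accumulated, then a softmax yields the class distribution.
W_cls = rng.standard_normal((H, NUM_CLASSES)) * 0.1
scores = global_feat @ W_cls                           # (T, NUM_CLASSES)
acc = scores.sum(axis=0)                               # temporal accumulation
probs = np.exp(acc - acc.max())
probs /= probs.sum()                                   # softmax over classes
```

In this sketch the per-part subnets see only their own joints, so the global layer is the first place relative motion between limbs and trunk can be captured, which mirrors the local-to-global design the abstract describes.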

Authors

