
Sequential Deep Trajectory Descriptor for Action Recognition With Three-Stream CNN

Journal

IEEE TRANSACTIONS ON MULTIMEDIA
Volume 19, Issue 7, Pages 1510-1520

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TMM.2017.2666540

Keywords

Action recognition; sequential deep trajectory descriptor (sDTD); three-stream framework; long-term motion

Funding

  1. National Basic Research Program of China [2015CB351806]
  2. National Natural Science Foundation of China [61390515, U1611461, 61425025, 61471042]
  3. Beijing Municipal Commission of Science and Technology [Z151100000915070]
  4. Shenzhen Peacock Plan

Abstract

Learning the spatial-temporal representation of motion information is crucial to human action recognition. Nevertheless, most of the existing features or descriptors cannot capture motion information effectively, especially for long-term motion. To address this problem, this paper proposes a long-term motion descriptor called sequential deep trajectory descriptor (sDTD). Specifically, we project dense trajectories into two-dimensional planes, and subsequently a CNN-RNN network is employed to learn an effective representation for long-term motion. Unlike the popular two-stream ConvNets, the sDTD stream is introduced into a three-stream framework so as to identify actions from a video sequence. Consequently, this three-stream framework can simultaneously capture static spatial features, short-term motion, and long-term motion in the video. Extensive experiments were conducted on three challenging datasets: KTH, HMDB51, and UCF101. Experimental results show that our method achieves state-of-the-art performance on the KTH and UCF101 datasets, and is comparable to the state-of-the-art methods on the HMDB51 dataset.
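
The abstract does not spell out how trajectories are projected into two-dimensional planes, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes each trajectory is a list of (x, y) pixel coordinates tracked over consecutive frames and simply counts how many trajectory points land in each cell of one plane. The function name `project_trajectories`, the input format, and the counting scheme are all hypothetical.

```python
def project_trajectories(trajectories, height, width):
    """Rasterize point trajectories onto a single 2D plane.

    Assumption (not from the paper): `trajectories` is a list of
    trajectories, each a list of (x, y) pixel coordinates.
    Returns a height x width grid (list of lists) in which each cell
    holds the number of trajectory points that fall into it.
    """
    plane = [[0] * width for _ in range(height)]
    for trajectory in trajectories:
        for x, y in trajectory:
            col, row = int(round(x)), int(round(y))
            # Points tracked outside the frame are silently dropped.
            if 0 <= row < height and 0 <= col < width:
                plane[row][col] += 1
    return plane


# Hypothetical toy input: three short trajectories on a 4x4 plane.
toy_trajectories = [
    [(0, 0), (1, 1), (2, 2)],
    [(2, 2), (3, 2)],
    [(10, 10)],  # lies outside the plane and is ignored
]
toy_plane = project_trajectories(toy_trajectories, height=4, width=4)
```

In a pipeline like the one the abstract describes, a sequence of such planes (one per temporal segment) would be the kind of input a CNN-RNN network could consume, with the CNN encoding each plane and the RNN modeling their order. That pairing is stated in the abstract; the per-segment split is an assumption here.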
