期刊
PATTERN RECOGNITION LETTERS
卷 50, 期 -, 页码 159-169出版社
ELSEVIER
DOI: 10.1016/j.patrec.2013.09.004
关键词
Human action recognition; Depth sensor; Spatio-temporal local features; Dynamic time warping; ReadingAct action dataset
For general home monitoring, a system should automatically interpret people's actions. The system should be non-intrusive, and able to deal with a cluttered background, and loose clothes. An approach based on spatio-temporal local features and a Bag-of-Words (BoW) model is proposed for single-person action recognition from combined intensity and depth images. To restore the temporal structure lost in the traditional BoW method, a dynamic time alignment technique with temporal binning is applied in this work, which has not been previously implemented in the literature for human action recognition on depth imagery. A novel human action dataset with depth data has been created using two Microsoft Kinect sensors. The ReadingAct dataset contains 20 subjects and 19 actions for a total of 2340 videos. To investigate the effect of using depth images and the proposed method, testing was conducted on three depth datasets, and the proposed method was compared to traditional Bag-of-Words methods. Results showed that the proposed method improves recognition accuracy when adding depth to the conventional intensity data, and has advantages when dealing with long actions. (C) 2013 Elsevier B.V. All rights reserved.
作者
我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。
推荐
暂无数据