Article

3D SMoSIFT: three-dimensional sparse motion scale invariant feature transform for activity recognition from RGB-D videos

Journal

JOURNAL OF ELECTRONIC IMAGING
Volume 23, Issue 2, Pages -

Publisher

SPIE - Society of Photo-Optical Instrumentation Engineers
DOI: 10.1117/1.JEI.23.2.023017

Keywords

three-dimensional sparse motion scale-invariant feature transform; bag of words model; spatiotemporal feature; optical flow; RGB-D data

Funding

  1. National Natural Science Foundation of China [61172128]
  2. National Key Basic Research Program of China [2012CB316304]
  3. New Century Excellent Talents in University [NCET-12-0768]
  4. Fundamental Research Funds for the Central Universities [2013JBZ003]
  5. Program for Innovative Research Team in University of Ministry of Education of China [IRT201206]
  6. Beijing Higher Education Young Elite Teacher Project [YETP0544]
  7. Research Fund for the Doctoral Program of Higher Education of China [20120009110008]

Abstract

Human activity recognition based on RGB-D data has received increasing attention in recent years. We propose a spatiotemporal feature named three-dimensional (3D) sparse motion scale-invariant feature transform (SIFT) from RGB-D data for activity recognition. First, we build pyramids as a scale space for each RGB and depth frame, and then use the Shi-Tomasi corner detector and sparse optical flow to quickly detect and track robust keypoints around the motion pattern in the scale space. Subsequently, local patches around the keypoints, extracted from the RGB-D data, are used to build 3D gradient and motion spaces, and SIFT-like descriptors are calculated on both 3D spaces. The proposed feature is invariant to scale, translation, and partial occlusions. More importantly, it can be computed quickly, which makes it well-suited for real-time applications. We have evaluated the proposed feature under a bag-of-words model on three public RGB-D datasets: the one-shot learning ChaLearn Gesture Dataset, the Cornell Activity Dataset-60, and the MSR Daily Activity 3D dataset. Experimental results show that the proposed feature outperforms other spatiotemporal features and is comparable to other state-of-the-art approaches, even though there is only one training sample for each class. © 2014 SPIE and IS&T
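
To make the detection-and-tracking stage described in the abstract concrete, the following is a minimal Python/OpenCV sketch: Gaussian pyramids as a per-frame scale space, Shi-Tomasi corners, and pyramidal Lucas-Kanade sparse optical flow. This is an illustration under stated assumptions, not the authors' implementation; the depth channel, the 3D gradient and motion spaces, and the SIFT-like descriptors are omitted, and the motion threshold min_flow is an assumed parameter rather than a value taken from the paper.

import cv2
import numpy as np

def gaussian_pyramid(gray, levels=3):
    # Scale space for one frame: repeated Gaussian blur + 2x downsampling.
    pyr = [gray]
    for _ in range(levels - 1):
        pyr.append(cv2.pyrDown(pyr[-1]))
    return pyr

def sparse_motion_keypoints(prev_gray, cur_gray, min_flow=1.0):
    # Shi-Tomasi corners detected in the previous frame.
    corners = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500,
                                      qualityLevel=0.01, minDistance=5)
    if corners is None:
        return np.empty((0, 2)), np.empty((0, 2))
    # Pyramidal Lucas-Kanade tracking (the sparse optical flow step).
    next_pts, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, cur_gray, corners, None, winSize=(21, 21), maxLevel=3)
    ok = status.ravel().astype(bool)
    p0 = corners.reshape(-1, 2)[ok]
    p1 = next_pts.reshape(-1, 2)[ok]
    # Keep only keypoints around the motion pattern: points whose
    # displacement exceeds a small threshold; static background is dropped.
    moving = np.linalg.norm(p1 - p0, axis=1) >= min_flow
    return p0[moving], p1[moving]

# Usage on two consecutive grayscale frames (depth handled analogously):
#   prev = cv2.cvtColor(frame0, cv2.COLOR_BGR2GRAY)
#   cur = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
#   for lvl_prev, lvl_cur in zip(gaussian_pyramid(prev), gaussian_pyramid(cur)):
#       starts, ends = sparse_motion_keypoints(lvl_prev, lvl_cur)

In the full method, the tracked keypoints would seed local RGB-D patches from which the 3D gradient and motion descriptors are computed and then quantized under the bag-of-words model.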

