☆ 4.6 Article

Multilevel Depth and Image Fusion for Human Activity Detection

IEEE TRANSACTIONS ON CYBERNETICS (2013)

期刊

IEEE TRANSACTIONS ON CYBERNETICS

卷 43, 期 5, 页码 1383-1394

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TCYB.2013.2276433

关键词

Action recognition and localization; depth sensor; spatial and temporal context

类别

Automation & Control Systems Computer Science, Artificial Intelligence Computer Science, Cybernetics

资金

Human Sixth Sense Program at the Advanced Digital Sciences Center from Singapore's Agency for Science, Technology and Research (A*STAR)
Singapore National Research Foundation under its International Research Centre at Singapore

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

摘要

Recognizing complex human activities usually requires the detection and modeling of individual visual features and the interactions between them. Current methods only rely on the visual features extracted from 2-D images, and therefore often lead to unreliable salient visual feature detection and inaccurate modeling of the interaction context between individual features. In this paper, we show that these problems can be addressed by combining data from a conventional camera and a depth sensor (e.g., Microsoft Kinect). We propose a novel complex activity recognition and localization framework that effectively fuses information from both grayscale and depth image channels at multiple levels of the video processing pipeline. In the individual visual feature detection level, depth-based filters are applied to the detected human/object rectangles to remove false detections. In the next level of interaction modeling, 3-D spatial and temporal contexts among human subjects or objects are extracted by integrating information from both grayscale and depth images. Depth information is also utilized to distinguish different types of indoor scenes. Finally, a latent structural model is developed to integrate the information from multiple levels of video processing for an activity detection. Extensive experiments on two activity recognition benchmarks (one with depth information) and a challenging grayscale + depth human activity database that contains complex interactions between human-human, human-object, and human-surroundings demonstrate the effectiveness of the proposed multilevel grayscale + depth fusion scheme. Higher recognition and localization accuracies are obtained relative to the previous methods.

Multilevel Depth and Image Fusion for Human Activity Detection

期刊

IEEE TRANSACTIONS ON CYBERNETICS

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

Multilevel Depth and Image Fusion for Human Activity Detection

期刊

IEEE TRANSACTIONS ON CYBERNETICS

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文