Article

Selective spatiotemporal features learning for dynamic gesture recognition

EXPERT SYSTEMS WITH APPLICATIONS (2021)

Journal

EXPERT SYSTEMS WITH APPLICATIONS

Volume 169, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD

DOI: 10.1016/j.eswa.2020.114499

Keywords

Dynamic gesture recognition; Deep learning; Spatiotemporal features learning; Heterogeneous network; Attention mechanism

Categories

Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic; Operations Research & Management Science

Funding

National Natural Science Foundation of China [61673079]
Natural Science Foundation of Chongqing [cstc2018jcyjAX0160]
Science and Technology Project of Chongqing Education Committee [KJQN201902404]

Summary

The paper introduces a novel gesture recognition model architecture that combines the ResC3D network and ConvLSTM with a dynamic select mechanism called Selective Spatiotemporal features learning (SeST). This heterogeneous network system can learn short-term and long-term spatiotemporal features simultaneously, and it outperforms other state-of-the-art methods on the evaluated datasets.

Abstract

Gesture recognition, which aims to understand meaningful movements of the human body, plays an essential role in human-computer interaction. The key to gesture recognition is learning compact and effective spatiotemporal information. However, it remains a challenging task because of interference from gesture-irrelevant factors. A number of attempts have been made to address this problem by cascading deep heterogeneous architectures, but this cascading strategy cannot capture both local and global spatiotemporal features at each stage of feature learning. In this paper, we propose a novel refined fusion model architecture that combines the ResC3D network and Convolutional LSTM (ConvLSTM) with a dynamic select mechanism called Selective Spatiotemporal features learning (SeST). Such a heterogeneous network system can simultaneously learn short-term and long-term spatiotemporal features, which are complementary to each other. Using soft attention, the SeST block enables the ResC3D network and ConvLSTM to adaptively adjust their contributions to classification during feature learning. The method has been evaluated on three publicly available datasets: the Sheffield Kinect Gesture (SKIG) dataset, the ChaLearn LAP large-scale isolated gesture dataset (IsoGD), and the EgoGesture dataset. Experimental results show that the proposed method outperforms other state-of-the-art methods. In addition, our model is end-to-end, so it can be embedded in many intelligent systems applications.
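
The abstract does not spell out how the SeST block computes its soft-attention weights, so the following is only a minimal NumPy sketch of one plausible reading: two feature maps of equal shape (standing in for the short-term ResC3D branch and the long-term ConvLSTM branch) are fused with per-channel weights that are softmax-normalized across the two branches. The function name `selective_fusion`, the squeeze layer, the weight shapes, and the random inputs are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def selective_fusion(short_term, long_term, w_squeeze, w_branches):
    """Fuse two branches with per-channel soft attention (hypothetical sketch).

    short_term, long_term: feature maps of shape (C, T, H, W)
    w_squeeze:  (D, C) weights of an assumed bottleneck layer
    w_branches: (2, C, D) weights producing one logit per branch and channel
    """
    # Summarize both branches into one global channel descriptor.
    pooled = (short_term + long_term).mean(axis=(1, 2, 3))   # (C,)
    # Bottleneck with ReLU.
    hidden = np.maximum(w_squeeze @ pooled, 0.0)             # (D,)
    # One logit per branch and channel, normalized across the two branches.
    logits = w_branches @ hidden                             # (2, C)
    weights = softmax(logits, axis=0)                        # (2, C)
    # Broadcast the channel weights over time and space, then mix.
    a = weights[0][:, None, None, None]
    b = weights[1][:, None, None, None]
    fused = a * short_term + b * long_term
    return fused, weights


# Toy usage with random tensors standing in for real branch outputs.
rng = np.random.default_rng(0)
C, T, H, W, D = 8, 4, 5, 5, 4
short_term = rng.standard_normal((C, T, H, W))
long_term = rng.standard_normal((C, T, H, W))
w_squeeze = rng.standard_normal((D, C))
w_branches = rng.standard_normal((2, C, D))

fused, weights = selective_fusion(short_term, long_term, w_squeeze, w_branches)
print(fused.shape)
print(weights.sum(axis=0))
```

Because the weights for each channel sum to one across the two branches, the fused output is a convex combination of the short-term and long-term features. This is one way to read "adaptively adjust their contributions" in the abstract.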
