Journal
NEUROCOMPUTING
Volume 275, Issue -, Pages 438-447
Publisher
ELSEVIER
DOI: 10.1016/j.neucom.2017.08.063
Keywords
Multiple feature learning; Deep learning; Autoencoder; Egocentric video; Activity recognition
Funding
- National Natural Science Foundation of China [61502080, 61632007]
- Fundamental Research Funds for the Central Universities [ZYGX2016J085, ZYGX2014Z007]
Egocentric activity recognition has recently gained great popularity in computer vision due to its widespread applications in egocentric video analysis. However, compared with conventional third-person activity recognition, it poses new challenges caused by significant body shaking, varied lengths, and poor recording quality. To handle these challenges, we propose deep appearance and motion learning (DAML) for egocentric activity recognition, which leverages the strength of deep learning networks in feature learning. In contrast to hand-crafted visual features or pre-trained convolutional neural network (CNN) features, which have limited generality to new egocentric videos, the proposed DAML is built on the deep autoencoder (DAE) and directly extracts appearance and motion features, the main cues of activities, from egocentric videos. The DAML takes advantage of the effectiveness and efficiency of the DAE in unsupervised feature learning, which provides a new representation learning framework for egocentric videos. The appearance and motion features learned by the DAML are seamlessly fused into a rich, informative egocentric activity representation that can be readily fed into any supervised learning model for activity recognition. Experimental results on two challenging benchmark datasets show that the DAML achieves high performance on both short- and long-term egocentric activity recognition tasks, comparable to or even better than state-of-the-art counterparts. © 2017 Elsevier B.V. All rights reserved.
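The pipeline outlined in the abstract (unsupervised autoencoder learning on appearance and motion inputs, followed by fusion) can be illustrated with a minimal sketch. This is not the authors' DAML implementation: the single hidden layer, the tanh activation, the plain gradient descent, the fusion by concatenation, the random placeholder descriptors, and all names (`train_autoencoder`, `encode`, `fuse_features`) are assumptions made only for illustration.

```python
import numpy as np


def train_autoencoder(X, hidden_dim, epochs=300, lr=0.5, seed=0):
    """Train a one-hidden-layer autoencoder with full-batch gradient descent.

    Returns the encoder parameters (W1, b1) and the reconstruction loss per epoch.
    Illustrative stand-in for a deep autoencoder, not the paper's architecture.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, size=(d, hidden_dim))
    b1 = np.zeros(hidden_dim)
    W2 = rng.normal(0.0, 0.1, size=(hidden_dim, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)       # encode
        X_hat = H @ W2 + b2            # decode with a linear output layer
        err = X_hat - X
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the mean squared reconstruction error.
        g_out = 2.0 * err / (n * d)
        g_W2 = H.T @ g_out
        g_b2 = g_out.sum(axis=0)
        g_pre = (g_out @ W2.T) * (1.0 - H ** 2)   # tanh derivative
        g_W1 = X.T @ g_pre
        g_b1 = g_pre.sum(axis=0)
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2
    return (W1, b1), losses


def encode(X, encoder):
    """Map inputs to the learned hidden representation."""
    W1, b1 = encoder
    return np.tanh(X @ W1 + b1)


def fuse_features(appearance_codes, motion_codes):
    """Fuse the two feature streams by concatenation (an assumed fusion rule)."""
    return np.concatenate([appearance_codes, motion_codes], axis=1)


# Hypothetical placeholder data: 40 clips with 32-dim appearance
# descriptors and 16-dim motion descriptors (random, not real video).
rng = np.random.default_rng(42)
appearance = rng.normal(size=(40, 32))
motion = rng.normal(size=(40, 16))

app_encoder, app_losses = train_autoencoder(appearance, hidden_dim=8)
mot_encoder, mot_losses = train_autoencoder(motion, hidden_dim=4)

fused = fuse_features(encode(appearance, app_encoder),
                      encode(motion, mot_encoder))
print(fused.shape)
```

The fused matrix has one row per clip and could then be passed to any supervised classifier, which matches the abstract's statement that the representation is independent of the downstream model.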