4.7 Article

Video-based emotion recognition in the wild using deep transfer learning and score fusion

Journal

IMAGE AND VISION COMPUTING
Volume 65, Issue -, Pages 66-75

Publisher

ELSEVIER
DOI: 10.1016/j.imavis.2017.01.012

Keywords

EmotiW; Emotion recognition in the wild; Multimodal fusion; Convolutional neural networks; Kernel extreme learning machine; Partial least squares

Multimodal recognition of affective states is a difficult problem unless the recording conditions are carefully controlled. For recognition in the wild, the challenges include large variances in face pose and illumination, cluttered backgrounds, occlusions, audio and video noise, and subtle cues of expression. In this paper, we describe a multimodal approach for video-based emotion recognition in the wild. We propose using summarizing functionals of complementary visual descriptors for video modeling. These features include deep convolutional neural network (CNN) based features obtained via transfer learning, for which we illustrate the importance of flexible registration and fine-tuning. Our approach combines audio and visual features with least squares regression based classifiers and weighted score level fusion. We report state-of-the-art results on the EmotiW Challenge for in-the-wild facial expression recognition. Our approach scales to other problems, and ranked top in the ChaLearn-LAP First Impressions Challenge 2016 on video clips collected in the wild. © 2017 Elsevier B.V. All rights reserved.
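
The two ideas named in the abstract, summarizing functionals over frame-level descriptors and weighted score level fusion, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `summarize_frames` and `fuse_scores`, the choice of mean and standard deviation as functionals, the weight `audio_weight=0.3`, and the toy scores are all assumptions made for illustration only.

```python
import numpy as np

def summarize_frames(frame_features):
    """Collapse a (n_frames, n_dims) array into one fixed-length video descriptor.

    Assumption: mean and standard deviation are used as the summarizing
    functionals here; the abstract does not say which functionals are used.
    """
    frame_features = np.asarray(frame_features, dtype=float)
    return np.concatenate([frame_features.mean(axis=0),
                           frame_features.std(axis=0)])

def fuse_scores(audio_scores, visual_scores, audio_weight=0.3):
    """Weighted score-level fusion of two (n_videos, n_classes) score matrices.

    Assumption: a single scalar weight shared by all classes; in practice it
    would be tuned on a validation set.
    """
    audio_scores = np.asarray(audio_scores, dtype=float)
    visual_scores = np.asarray(visual_scores, dtype=float)
    fused = audio_weight * audio_scores + (1.0 - audio_weight) * visual_scores
    # Predicted class per video is the one with the highest fused score.
    return fused, fused.argmax(axis=1)

# Toy example: two frames with two feature dimensions each.
frames = np.array([[1.0, 2.0], [3.0, 4.0]])
video_descriptor = summarize_frames(frames)

# Toy per-class scores from hypothetical audio and visual classifiers.
audio = np.array([[0.9, 0.1], [0.4, 0.6]])
visual = np.array([[0.2, 0.8], [0.3, 0.7]])
fused, predicted = fuse_scores(audio, visual, audio_weight=0.3)
```

With these toy numbers the visual scores dominate the first video, so fusion overrides the audio classifier's preference there. The descriptor length is twice the frame feature dimension regardless of how many frames the video has, which is what lets variable-length clips feed a fixed-input classifier.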
