Article

Multimodal 2D+3D Facial Expression Recognition With Deep Fusion Convolutional Neural Network

Journal

IEEE TRANSACTIONS ON MULTIMEDIA
Volume 19, Issue 12, Pages 2816-2831

Publisher

IEEE (Institute of Electrical and Electronics Engineers, Inc.)
DOI: 10.1109/TMM.2017.2713408

Keywords

Deep fusion convolutional neural network (DF-CNN); facial expression recognition (FER); multimodal; textured three-dimensional (3D) face scan

Funding

  1. NSFC [11401464, 61472313, 11622106]
  2. Chinese Postdoctoral Science Foundation [2014M560785]
  3. International Exchange Foundation of China NSFC and United Kingdom RS [61711530242]
  4. French Research Agency, l'Agence Nationale de Recherche, through the Jemime project [ANR-13-CORD-0004-02]
  5. French Research Agency, l'Agence Nationale de Recherche, through the Biofence project [ANR-13-INSE-0004-02]
  6. PUF 4D Vision project (Partner University Foundation)

Abstract

This paper presents a novel and efficient deep fusion convolutional neural network (DF-CNN) for multimodal 2D+3D facial expression recognition (FER). DF-CNN comprises a feature extraction subnet, a feature fusion subnet, and a softmax layer. In particular, each textured three-dimensional (3D) face scan is represented as six types of 2D facial attribute maps (i.e., a geometry map, three normal maps, a curvature map, and a texture map), all of which are jointly fed into DF-CNN for feature learning and fusion learning, resulting in a highly concentrated 32-dimensional facial representation. Expression prediction is performed in two ways: 1) learning linear support vector machine classifiers on the 32-dimensional fused deep features, or 2) directly performing softmax prediction on the six-dimensional expression probability vectors. Unlike existing 3D FER methods, DF-CNN combines feature learning and fusion learning in a single end-to-end training framework. To demonstrate its effectiveness, we conducted comprehensive experiments comparing DF-CNN with handcrafted features, pre-trained deep features, fine-tuned deep features, and state-of-the-art methods on three 3D face datasets (i.e., BU-3DFE Subset I, BU-3DFE Subset II, and the Bosphorus Subset). In all cases, DF-CNN consistently achieved the best results. To the best of our knowledge, this is the first work to introduce deep CNNs into 3D FER and deep learning-based feature-level fusion into multimodal 2D+3D FER.
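The data flow described in the abstract (six attribute-map branches, feature fusion into a 32-dimensional representation, then a six-way softmax) can be sketched as follows. This is a minimal NumPy illustration of the tensor shapes only, not the authors' implementation: the 32x32 map size, the 128-dimensional branch features, and the random placeholder weights are assumptions for illustration; only the six input maps, the 32-dimensional fused feature, and the six expression classes come from the paper.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)

# Six 2D facial attribute maps per textured 3D scan (geometry map,
# three normal maps, curvature map, texture map), flattened here to
# hypothetical 32x32 maps for illustration.
maps = [rng.standard_normal(32 * 32) for _ in range(6)]

# Feature extraction subnet: one branch per attribute map (random
# placeholder weights; the real subnet learns convolutional filters).
W_branch = rng.standard_normal((6, 128, 32 * 32)) * 0.01
branch_feats = [relu(W_branch[i] @ maps[i]) for i in range(6)]

# Feature fusion subnet: concatenate the six branch features and fuse
# them into the highly concentrated 32-dimensional representation.
concat = np.concatenate(branch_feats)  # 6 * 128 = 768 dims
W_fuse = rng.standard_normal((32, concat.size)) * 0.01
fused = relu(W_fuse @ concat)          # 32-D fused deep feature

# Softmax layer: six-dimensional expression probability vector, one
# entry per basic expression class.
W_out = rng.standard_normal((6, 32)) * 0.01
probs = softmax(W_out @ fused)

print(fused.shape, probs.shape)
```

At prediction time, the paper's first option trains linear SVMs on `fused`, while the second reads the class directly off `probs` via arg-max.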

