Article

Learning Local Attention With Guidance Map for Pose Robust Facial Expression Recognition

Journal

IEEE ACCESS
Volume 10, Pages 85929-85940

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2022.3198658

Keywords

Feature extraction; Face recognition; Convolutional neural networks; Training data; Facial features; Annotations; Facial expression recognition; pose robust; local attention; guidance map

Facial expression recognition (FER) is a challenging task, especially under unconstrained conditions with varying head poses. To address this problem, the authors propose a local attention network (LAN) that adaptively captures important facial regions according to pose variations. The LAN improves FER performance by emphasizing attentive regions and suppressing regions that do not discriminate between classes. Experiments on multiple datasets demonstrate the effectiveness of the LAN and show that it compares favorably with previous methods.
Facial expression recognition (FER) is an extremely challenging task under unconstrained conditions. In particular, varying head poses degrade performance dramatically because they cause large variations in the appearance of facial expressions. To address this problem, we propose a local attention network (LAN), which adaptively captures the important facial regions according to pose variations. The LAN emphasizes the more attentive regions while suppressing regions that do not discriminate between classes. To identify attentive regions, we propose a simple yet efficient unsupervised method for annotating coarse-level attention guidance maps. The guidance map assigns attention values to regions based on whether their features are deformed by facial pose. Further, the attentive regional features obtained by our LAN are combined with the original global features for pose-invariant FER. We validate our method on a controlled multiview dataset, KDEF; three popular in-the-wild datasets, RAF-DB, FERPlus, and AffectNet; and their subsets that contain images under pose variation conditions. Extensive experiments show that our LAN largely improves the performance of FER under pose variations. Our method also performs favorably against previous methods.
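
The abstract does not give the LAN architecture, so the following is only an illustrative sketch of the general idea it describes: weighting regional features by an attention map and combining the result with a global feature. The function names (`mean_pool`, `attention_pool`, `combine`), the toy feature values, the binary attention values, and the choice of concatenation as the combination step are all assumptions made for illustration, not details from the paper.

```python
from typing import List

Vector = List[float]


def mean_pool(regions: List[Vector]) -> Vector:
    """Global feature: plain average over all regional feature vectors."""
    n = len(regions)
    dim = len(regions[0])
    return [sum(r[d] for r in regions) / n for d in range(dim)]


def attention_pool(regions: List[Vector], attention: List[float]) -> Vector:
    """Local feature: average of regional features weighted by attention."""
    total = sum(attention)
    if total == 0:
        raise ValueError("attention map must contain a positive value")
    dim = len(regions[0])
    return [
        sum(a * r[d] for a, r in zip(attention, regions)) / total
        for d in range(dim)
    ]


def combine(regions: List[Vector], attention: List[float]) -> Vector:
    """Concatenate the attended local feature with the global feature.

    Concatenation is an assumed fusion step chosen for simplicity.
    """
    return attention_pool(regions, attention) + mean_pool(regions)


# Hypothetical 2x2 grid of facial regions (flattened), 2-D feature each.
regions = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [4.0, 0.0]]
# Hypothetical guidance values: 1.0 keeps a region, 0.0 suppresses it.
attention = [1.0, 1.0, 0.0, 0.0]

fused = combine(regions, attention)
print(fused)
```

In this toy case the first two entries of `fused` come only from the two regions with nonzero attention, while the last two entries average over all four regions, which mirrors the abstract's distinction between attentive regional features and global features.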
