4.7 Article

Patch attention convolutional vision transformer for facial expression recognition with occlusion

Journal

INFORMATION SCIENCES
Volume 619, Pages 781-794

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2022.11.068

Keywords

Facial expression recognition; Occlusion; Local and global feature; Self-attention; Vision transformer

A Patch Attention Convolutional Vision Transformer (PACVT) is proposed to tackle the occlusion problem in Facial Expression Recognition (FER). It extracts local and global features from facial patches and uses self-attention to focus on salient patches with discriminative features. Experimental results demonstrate the superiority of PACVT in occlusion FER.
Despite substantial progress in Facial Expression Recognition (FER) in recent decades, most previous methods have been developed to recognize constrained facial expressions. Real-world occlusions lead to invisible facial regions and contaminated facial features, which undoubtedly increase the difficulty of FER in the wild. Therefore, a Patch Attention Convolutional Vision Transformer (PACVT) is proposed to tackle the occlusion FER problem. The backbone convolutional neural network is used to extract facial feature maps, which are cropped into multiple regional patches to extract local and global features. The Patch Attention Unit (PAU) is designed to perceive occluded regions by adaptively calculating the patch-level attention weights of local features for expression recognition. The facial patches are mapped into sequences of visual tokens, and the Vision Transformer (ViT) is employed to capture the interactions and correlations between these visual tokens from a global perspective. The self-attention in ViT enables the PACVT to focus on the salient patches with discriminative features and ignore the occlusion. Experiments are conducted on three widely used expression datasets and their occlusion subsets, and the results demonstrate that the proposed PACVT outperforms state-of-the-art methods on occlusion FER. Cross-dataset experimental results demonstrate the generalization ability of the PACVT. © 2022 Elsevier Inc. All rights reserved.
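
The pipeline described in the abstract (feature map, regional patches, patch-level attention weights, self-attention over patch tokens) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the internals of the PAU or the ViT configuration, so the sigmoid gate, the average pooling, the identity projections in the attention step, and all names and parameters (`crop_patches`, `pool_patch`, `patch_attention_weights`, `w`, `b`, `grid`) are assumptions chosen for illustration only. The toy feature map zeroes one quadrant to stand in for an occluded region.

```python
import math

def crop_patches(feature_map, grid):
    """Split an H x W x C feature map (nested lists) into grid x grid patches."""
    h, w = len(feature_map), len(feature_map[0])
    ph, pw = h // grid, w // grid
    patches = []
    for gy in range(grid):
        for gx in range(grid):
            patch = [row[gx * pw:(gx + 1) * pw]
                     for row in feature_map[gy * ph:(gy + 1) * ph]]
            patches.append(patch)
    return patches

def pool_patch(patch):
    """Average-pool one patch into a single C-dimensional token."""
    cells = [cell for row in patch for cell in row]
    channels = len(cells[0])
    return [sum(c[k] for c in cells) / len(cells) for k in range(channels)]

def patch_attention_weights(tokens, w, b):
    """Hypothetical gate (assumption): sigmoid(w . token + b), one weight per patch."""
    return [1.0 / (1.0 + math.exp(-(sum(wi * ti for wi, ti in zip(w, t)) + b)))
            for t in tokens]

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Single-head scaled dot-product self-attention with identity projections."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        attn = softmax(scores)
        out.append([sum(a * v[j] for a, v in zip(attn, tokens))
                    for j in range(d)])
    return out

# Toy 4 x 4 feature map with 2 channels; the top-left quadrant is zeroed
# to mimic an occluded facial region.
feature_map = [[[0.0, 0.0] if (y < 2 and x < 2) else [1.0, 2.0]
                for x in range(4)] for y in range(4)]

patches = crop_patches(feature_map, grid=2)        # 4 regional patches
tokens = [pool_patch(p) for p in patches]          # one token per patch
weights = patch_attention_weights(tokens, w=[1.0, 1.0], b=-1.0)
weighted = [[wt * v for v in tok] for wt, tok in zip(weights, tokens)]
context = self_attention(weighted)                 # global token interactions
```

In this toy setup the zeroed patch receives a lower gate weight than the others, which is the qualitative behaviour the abstract attributes to the PAU; a trained model would learn the gate and the attention projections rather than use fixed values.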
