Proceedings Paper

Towards a Better Gold Standard: Denoising and Modelling Continuous Emotion Annotations Based on Feature Agglomeration and Outlier Regularisation

Publisher

Association for Computing Machinery (ACM)
DOI: 10.1145/3266302.3266307

Keywords

emotion recognition; arousal valence distribution; feature agglomeration; outlier detection; multimodal fusion

Funding

  1. Swiss National Science Foundation [2000221E-164326]

Abstract

Emotions are often perceived by humans through a series of multimodal cues, such as verbal expressions, facial expressions and gestures. In order to recognise emotions automatically, reliable emotional labels are required to learn a mapping from human expressions to the corresponding emotions. Dimensional emotion models have become popular and are widely applied for annotating emotions continuously in the time domain. However, the statistical relationship between emotional dimensions is rarely studied. This paper presents a solution to automatic emotion recognition for the Audio/Visual Emotion Challenge (AVEC) 2018. The objective is to find a robust way to detect emotions using more reliable emotion annotations in the valence and arousal dimensions. The two main contributions of this paper are: 1) a new approach capable of generating more dependable emotional ratings for both arousal and valence from multiple annotators by extracting consistent annotation features; 2) an exploration of the valence-arousal distribution using outlier detection methods, which reveals a characteristic oblique elliptic shape. With the learned distribution, we can detect prediction outliers based on their local density deviations and correct them towards the learned distribution. The performance of the proposed method is evaluated on the RECOLA database, which contains audio, video and physiological recordings. Our results show that a moving average filter is sufficient to remove incidental errors from the annotations, and that unsupervised dimensionality reduction approaches can be used to derive a gold-standard annotation from multiple annotators. Compared with the AVEC 2018 baseline model, our approach significantly improves the concordance correlation coefficient for arousal and valence prediction, to 0.821 and 0.589 respectively.
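
The abstract reports that a moving average filter is sufficient to remove incidental annotation errors. A minimal sketch of that smoothing step, assuming each annotator trace is a 1-D array sampled at a fixed rate; the window size is a hypothetical choice, not a value taken from the paper:

```python
import numpy as np

def moving_average(trace: np.ndarray, window: int = 25) -> np.ndarray:
    """Smooth a continuous annotation trace with a centred moving average."""
    kernel = np.ones(window) / window
    # mode="same" keeps the output aligned with the input; values near the
    # edges are computed against implicit zero padding.
    return np.convolve(trace, kernel, mode="same")
```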
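To derive a single gold standard from multiple annotators, the paper names feature agglomeration. A hedged sketch using scikit-learn's FeatureAgglomeration, treating each annotator's trace as one feature column; with a single cluster this reduces to a hierarchically grouped average, and the paper's actual pipeline may differ:

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration

def fuse_annotations(traces: np.ndarray, n_clusters: int = 1) -> np.ndarray:
    """traces: (n_frames, n_annotators) array of smoothed annotations."""
    agglo = FeatureAgglomeration(n_clusters=n_clusters)
    fused = agglo.fit_transform(traces)  # shape: (n_frames, n_clusters)
    return fused.squeeze()
```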
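The outlier step flags predictions by their local density deviations, which matches the Local Outlier Factor family of methods. A sketch, assuming (valence, arousal) pairs as rows; the correction rule below, shrinking outliers towards the training mean, is an illustrative assumption rather than the paper's exact procedure:

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

def correct_outliers(train_va: np.ndarray, pred_va: np.ndarray,
                     shrink: float = 0.5) -> np.ndarray:
    """train_va, pred_va: (n, 2) arrays of (valence, arousal) points."""
    # novelty=True lets the fitted model score previously unseen points.
    lof = LocalOutlierFactor(n_neighbors=20, novelty=True)
    lof.fit(train_va)
    is_outlier = lof.predict(pred_va) == -1  # -1 marks outliers
    centre = train_va.mean(axis=0)
    corrected = pred_va.copy()
    # Pull flagged predictions part of the way towards the centre of the
    # learned distribution (hypothetical correction rule).
    corrected[is_outlier] = ((1 - shrink) * pred_va[is_outlier]
                             + shrink * centre)
    return corrected
```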
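The reported scores (0.821 arousal, 0.589 valence) use the concordance correlation coefficient, the standard AVEC evaluation metric. Its definition, computed directly with numpy:

```python
import numpy as np

def ccc(x: np.ndarray, y: np.ndarray) -> float:
    """Lin's concordance correlation coefficient between two sequences."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return 2.0 * cov / (vx + vy + (mx - my) ** 2)
```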
