4.6 Review

Deep Multimodal Emotion Recognition on Human Speech: A Review

Related references

Note: Only a subset of the references is listed.
Article Computer Science, Artificial Intelligence

SEWA DB: A Rich Database for Audio-Visual Emotion and Sentiment Research in the Wild

Jean Kossaifi et al.

Summary: As digital devices become increasingly integral to daily life, natural human-computer interaction and audio-visual human behaviour sensing grow in importance. The SEWA database provides a valuable resource of over 2000 minutes of audio-visual data from 398 individuals across six cultures, supporting research in affective computing and automatic human sensing.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2021)

Article Computer Science, Artificial Intelligence

What makes the difference? An empirical comparison of fusion strategies for multimodal language analysis

Dimitris Gkoumas et al.

Summary: The study compares eleven state-of-the-art modality fusion methods for video sentiment analysis and finds that attention mechanisms are effective but computationally expensive, while adding further levels of crossmodal interaction decreases performance. Positive-sentiment utterances are the most challenging cases for all approaches, and integrating the linguistic modality as a pivot for the non-verbal modalities improves performance.

INFORMATION FUSION (2021)

Article Acoustics

Analyzing Multimodal Sentiment Via Acoustic- and Visual-LSTM With Channel-Aware Temporal Convolution Network

Sijie Mai et al.

Summary: The study focuses on learning inter-modality dynamics through acoustic- and visual-LSTMs where language features play a dominant role. A 'channel-aware' temporal convolution network is introduced in the unimodal representation learning stage to extract high-level representations for each modality.

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING (2021)

Article Engineering, Biomedical

Analysis of speech features and personality traits

A. Guidi et al.

BIOMEDICAL SIGNAL PROCESSING AND CONTROL (2019)

Article Computer Science, Artificial Intelligence

AFEW-VA database for valence and arousal estimation in-the-wild

Jean Kossaifi et al.

IMAGE AND VISION COMPUTING (2017)

Article Computer Science, Artificial Intelligence

Evaluating deep learning architectures for Speech Emotion Recognition

Haytham M. Fayek et al.

NEURAL NETWORKS (2017)

Article Computer Science, Artificial Intelligence

Speaker-sensitive emotion recognition via ranking: Studies on acted and spontaneous speech

Houwei Cao et al.

COMPUTER SPEECH AND LANGUAGE (2015)

Review Biochemistry & Molecular Biology

The Human Face as a Dynamic Tool for Social Communication

Rachael E. Jack et al.

CURRENT BIOLOGY (2015)

Article Audiology & Speech-Language Pathology

Voice emotion recognition by cochlear-implanted children and their normally-hearing peers

Monita Chatterjee et al.

HEARING RESEARCH (2015)

Article Computer Science, Artificial Intelligence

The MAHNOB Mimicry Database: A database of naturalistic human interactions

Sanjay Bilakhia et al.

PATTERN RECOGNITION LETTERS (2015)

Article Multidisciplinary Sciences

pyAudioAnalysis: An Open-Source Python Library for Audio Signal Analysis

Theodoros Giannakopoulos

PLOS ONE (2015)
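The pyAudioAnalysis library's core operation is windowed short-term feature extraction over a speech signal. As a rough illustration of the idea, the sketch below re-implements two classic short-term features (zero-crossing rate and frame energy) in plain Python; the function name and signature here are illustrative, not the library's own API.

```python
def short_term_features(signal, fs, win_sec=0.050, step_sec=0.025):
    """Frame-wise zero-crossing rate and energy, in the style of
    short-term analysis libraries such as pyAudioAnalysis.
    Illustrative re-implementation, not the library's API.

    signal: mono audio as a sequence of floats in [-1, 1]
    fs:     sampling rate in Hz
    """
    win = int(win_sec * fs)    # frame length in samples
    step = int(step_sec * fs)  # hop size in samples
    features = []
    for start in range(0, len(signal) - win + 1, step):
        frame = signal[start:start + win]
        # zero-crossing rate: fraction of sample pairs that change sign
        zcr = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0) / (win - 1)
        # short-term energy: mean squared amplitude of the frame
        energy = sum(x * x for x in frame) / win
        features.append((zcr, energy))
    return features
```

For example, on a pure 100 Hz sine wave the zero-crossing rate per 50 ms frame is low and roughly constant, while the energy is close to 0.5 (the mean square of a unit-amplitude sinusoid); emotional speech shows far more frame-to-frame variation in both, which is what makes such features useful for recognition.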

Article Engineering, Electrical & Electronic

Autoencoder-based Unsupervised Domain Adaptation for Speech Emotion Recognition

Jun Deng et al.

IEEE SIGNAL PROCESSING LETTERS (2014)

Article Computer Science, Information Systems

Learning Salient Features for Speech Emotion Recognition Using Convolutional Neural Networks

Qirong Mao et al.

IEEE TRANSACTIONS ON MULTIMEDIA (2014)

Article Engineering, Electrical & Electronic

A comparative analysis of classifiers in emotion recognition through acoustic features

Swarna Kuchibhotla et al.

INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY (2014)

Article Computer Science, Hardware & Architecture

Collecting Large, Richly Annotated Facial-Expression Databases from Movies

Abhinav Dhall et al.

IEEE MULTIMEDIA (2012)

Article Neurosciences

Neural Synchronization during Face-to-Face Communication

Jing Jiang et al.

JOURNAL OF NEUROSCIENCE (2012)

Article Computer Science, Artificial Intelligence

The SEMAINE Database: Annotated Multimodal Records of Emotionally Colored Conversations between a Person and a Limited Agent

Gary McKeown et al.

IEEE TRANSACTIONS ON AFFECTIVE COMPUTING (2012)

Review Engineering, Electrical & Electronic

Emotion recognition from speech: a review

Shashidhar G. Koolagudi et al.

INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY (2012)

Article Computer Science, Interdisciplinary Applications

IEMOCAP: interactive emotional dyadic motion capture database

Carlos Busso et al.

LANGUAGE RESOURCES AND EVALUATION (2008)

Article Acoustics

Nonlinear feature based classification of speech under stress

GJ Zhou et al.

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING (2001)