4.7 Article

Survey of Deep Representation Learning for Speech Emotion Recognition

Related References

Note: Only some of the references are listed here; download the original article for the complete reference information.

Article Computer Science, Artificial Intelligence

Multi-Task Semi-Supervised Adversarial Autoencoding for Speech Emotion Recognition

Siddique Latif et al.

Summary: This article proposes a way to improve the low accuracy of speech emotion recognition (SER). Combining auxiliary tasks, large datasets, and an adversarial autoencoder (AAE) for semi-supervised learning significantly improves SER performance and achieves state-of-the-art results.

IEEE TRANSACTIONS ON AFFECTIVE COMPUTING (2022)

Article Computer Science, Artificial Intelligence

Modeling Feature Representations for Affective Speech Using Generative Adversarial Networks

Saurabh Sahu et al.

Summary: This article explores various GAN architectures for generating feature vectors corresponding to emotions and proposes different metrics for measuring the performance of GAN models in generating realistic synthetic samples.

IEEE TRANSACTIONS ON AFFECTIVE COMPUTING (2022)

Article Computer Science, Artificial Intelligence

Self-Supervised Visual Feature Learning With Deep Neural Networks: A Survey

Longlong Jing et al.

Summary: This paper reviews deep learning-based self-supervised general visual feature learning methods, covering motivation, pipeline, architectures, schema, evaluation metrics, datasets, performance comparisons, and future directions.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2021)

Article Computer Science, Artificial Intelligence

Improving Cross-Corpus Speech Emotion Recognition with Adversarial Discriminative Domain Generalization (ADDoG)

John Gideon et al.

Summary: Automatic speech emotion recognition provides computers with important context for user understanding. While current methods often fail when applied to unseen datasets, recent research has focused on adversarial methods to create more generalized representations of emotional speech. The introduced Adversarial Discriminative Domain Generalization (ADDoG) method improves cross-dataset generalization by iteratively moving representations learned for each dataset closer to one another.

IEEE TRANSACTIONS ON AFFECTIVE COMPUTING (2021)

Proceedings Paper Computer Science, Information Systems

Privacy Enhanced Speech Emotion Communication using Deep Learning Aided Edge Computing

Hafiz Shehbaz Ali et al.

Summary: This research introduces a privacy-enhanced emotion communication system to protect users' personal information in emotion-sensing applications. By using an adversarial learning framework, private information in speech representations can be unlearned to enhance the robustness of emotion identification.

2021 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS WORKSHOPS (ICC WORKSHOPS) (2021)

Article Engineering, Biomedical

Speech Technology for Healthcare: Opportunities, Challenges, and State of the Art

Siddique Latif et al.

Summary: This paper discusses the great potential of speech technology in the healthcare industry, reviewing the latest approaches in automatic speech recognition, speech synthesis, and health detection using speech signals. It also presents various challenges hindering the growth of speech-based healthcare services and suggests possible research directions to fully leverage the advantages of other technologies for more effective solutions.

IEEE REVIEWS IN BIOMEDICAL ENGINEERING (2021)

Proceedings Paper Computer Science, Artificial Intelligence

Unsupervised Cross-Lingual Speech Emotion Recognition Using Domain Adversarial Neural Network

Xiong Cai et al.

Summary: The proposed method utilizes a Domain Adversarial Neural Network (DANN) to address the distribution shift problem in cross-lingual Speech Emotion Recognition (SER), resulting in improved model performance. Experimental results demonstrate a significant improvement in arousal and valence classification tasks compared to the baseline system.

2021 12TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) (2021)

Proceedings Paper Computer Science, Artificial Intelligence

Self-Supervised Learning With Cross-Modal Transformers for Emotion Recognition

Aparna Khare et al.

Summary: The study extended self-supervised training to multi-modal applications by using a transformer model trained on a masked language modeling task with audio, visual, and text features to learn multi-modal representations. Results showed that this pre-training technique can improve emotion recognition performance.

2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT) (2021)

Proceedings Paper Computer Science, Artificial Intelligence

On the Use of Self-Supervised Pre-Trained Acoustic and Linguistic Features for Continuous Speech Emotion Recognition

Manon Macary et al.

Summary: Pre-trained wav2vec and CamemBERT models prove effective as feature extractors for continuous speech emotion recognition (SER). Jointly using wav2vec and BERT-like pre-trained features suits SER tasks with limited labeled data and achieves higher concordance correlation coefficient (CCC) values than traditional features such as MFCCs and word2vec word embeddings.

2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT) (2021)

Proceedings Paper Acoustics

Visually Guided Self Supervised Learning of Speech Representations

Abhinav Shukla et al.

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (2020)

Article Computer Science, Information Systems

Multimodal Emotion Recognition With Transformer-Based Self Supervised Feature Fusion

Shamane Siriwardhana et al.

IEEE ACCESS (2020)

Article Acoustics

Semi-Supervised Speech Emotion Recognition With Ladder Networks

Srinivas Parthasarathy et al.

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING (2020)

Article Computer Science, Artificial Intelligence

Learning Class-Aligned and Generalized Domain-Invariant Representations for Speech Emotion Recognition

Yufeng Xiao et al.

IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE (2020)

Article Computer Science, Artificial Intelligence

Cross-Corpus Acoustic Emotion Recognition with Multi-Task Learning: Seeking Common Ground While Preserving Differences

Biqiao Zhang et al.

IEEE TRANSACTIONS ON AFFECTIVE COMPUTING (2019)

Article Computer Science, Artificial Intelligence

Building Naturalistic Emotionally Balanced Speech Corpus by Retrieving Emotional Speech from Existing Podcast Recordings

Reza Lotfian et al.

IEEE TRANSACTIONS ON AFFECTIVE COMPUTING (2019)

Article Computer Science, Artificial Intelligence

Feature fusion methods research based on deep belief networks for speech emotion recognition under noise condition

Yongming Huang et al.

JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING (2019)

Review Computer Science, Artificial Intelligence

From shallow feature learning to deep learning: Benefits from the width and depth of deep architectures

Guoqiang Zhong et al.

WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY (2019)

Review Computer Science, Information Systems

Safe semi-supervised learning: a brief introduction

Yu-Feng Li et al.

FRONTIERS OF COMPUTER SCIENCE (2019)

Article Engineering, Electrical & Electronic

Caveat Emptor: The Risks of Using Big Data for Human Development

Siddique Latif et al.

IEEE TECHNOLOGY AND SOCIETY MAGAZINE (2019)

Article Computer Science, Artificial Intelligence

Adversarial Examples: Attacks and Defenses for Deep Learning

Xiaoyong Yuan et al.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2019)

Article Acoustics

Speech emotion recognition based on DNN-decision tree SVM model

Linhui Sun et al.

SPEECH COMMUNICATION (2019)

Article Acoustics

Unsupervised Speech Representation Learning Using WaveNet Autoencoders

Jan Chorowski et al.

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING (2019)

Proceedings Paper Computer Science, Artificial Intelligence

A Comparison of Transformer and LSTM Encoder Decoder Models for ASR

Albert Zeyer et al.

2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019) (2019)

Proceedings Paper Computer Science, Artificial Intelligence

A Comparative Study on Transformer vs RNN in Speech Applications

Shigeki Karita et al.

2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019) (2019)

Article Automation & Control Systems

Semi-supervised Ladder Networks for Speech Emotion Recognition

Jian-Hua Tao et al.

INTERNATIONAL JOURNAL OF AUTOMATION AND COMPUTING (2019)

Article Computer Science, Information Systems

Diversity in Machine Learning

Zhiqiang Gong et al.

IEEE ACCESS (2019)

Article Acoustics

Semisupervised Autoencoders for Speech Emotion Recognition

Jun Deng et al.

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING (2018)

Article Computer Science, Information Systems

Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey

Naveed Akhtar et al.

IEEE ACCESS (2018)

Article Computer Science, Information Systems

Leveraging Unlabeled Data for Emotion Recognition With Enhanced Collaborative Semi-Supervised Learning

Zixing Zhang et al.

IEEE ACCESS (2018)

Article Acoustics

Domain Adversarial for Acoustic Emotion Recognition

Mohammed Abdelwahab et al.

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING (2018)

Proceedings Paper Computer Science, Artificial Intelligence

On Enhancing Speech Emotion Recognition using Generative Adversarial Networks

Saurabh Sahu et al.

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES (2018)

Proceedings Paper Computer Science, Artificial Intelligence

Emotion Identification from raw speech signals using DNNs

Mousmita Sarma et al.

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES (2018)

Proceedings Paper Computer Science, Artificial Intelligence

Ladder Networks for Emotion Recognition: Using Unsupervised Auxiliary Tasks to Improve Predictions of Emotional Attributes

Srinivas Parthasarathy et al.

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES (2018)

Proceedings Paper Computer Science, Artificial Intelligence

Predicting Categorical Emotions by Jointly Learning Primary and Secondary Emotions Through Multitask Learning

Reza Lotfian et al.

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES (2018)

Proceedings Paper Computer Science, Artificial Intelligence

Transfer Learning for Improving Speech Emotion Classification Accuracy

Siddique Latif et al.

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES (2018)

Proceedings Paper Computer Science, Artificial Intelligence

Variational Autoencoders to Learn Latent Representations of Speech Emotion

Siddique Latif et al.

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES (2018)

Proceedings Paper Computer Science, Artificial Intelligence

The PRIORI Emotion Dataset: Linking Mood to Emotion Detected In-the-Wild

Soheil Khorram et al.

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES (2018)

Review Engineering, Electrical & Electronic

Databases, features and classifiers for speech emotion recognition: a review

Monorama Swain et al.

INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY (2018)

Article Engineering, Electrical & Electronic

Universum Autoencoder-Based Domain Adaptation for Speech Emotion Recognition

Jun Deng et al.

IEEE SIGNAL PROCESSING LETTERS (2017)

Article Engineering, Electrical & Electronic

Deep Reinforcement Learning: A Brief Survey

Kai Arulkumaran et al.

IEEE SIGNAL PROCESSING MAGAZINE (2017)

Article Computer Science, Artificial Intelligence

Evaluating deep learning architectures for Speech Emotion Recognition

Haytham M. Fayek et al.

NEURAL NETWORKS (2017)

Article Computer Science, Artificial Intelligence

MSP-IMPROV: An Acted Corpus of Dyadic Interactions to Study Emotion Perception

Carlos Busso et al.

IEEE TRANSACTIONS ON AFFECTIVE COMPUTING (2017)

Article Computer Science, Artificial Intelligence

A Multi-Task Learning Framework for Emotion Recognition Using 2D Continuous Space

Rui Xia et al.

IEEE TRANSACTIONS ON AFFECTIVE COMPUTING (2017)

Proceedings Paper Computer Science, Artificial Intelligence

Densely Connected Convolutional Networks

Gao Huang et al.

30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017) (2017)

Article Computer Science, Artificial Intelligence

Non-Local Auto-Encoder With Collaborative Stabilization for Image Restoration

Ruxin Wang et al.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2016)

Article Computer Science, Artificial Intelligence

The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for Voice Research and Affective Computing

Florian Eyben et al.

IEEE TRANSACTIONS ON AFFECTIVE COMPUTING (2016)

Article Computer Science, Information Systems

A Novel DBN Feature Fusion Model for Cross-Corpus Speech Emotion Recognition

Zou Cairong et al.

JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING (2016)

Proceedings Paper Acoustics

Representation Learning for Speech Emotion Recognition

Sayan Ghosh et al.

17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES (2016)

Proceedings Paper Acoustics

Unsupervised Deep Auditory Model Using Stack of Convolutional RBMs For Speech Recognition

Hardik B. Sailor et al.

17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES (2016)

Article Acoustics

Noisy training for deep neural networks in speech recognition

Shi Yin et al.

EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING (2015)

Article Computer Science, Information Systems

Speech emotion recognition with unsupervised feature learning

Zheng-wei Huang et al.

FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING (2015)

Proceedings Paper Computer Science, Artificial Intelligence

Fast R-CNN

Ross Girshick

2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) (2015)

Article Engineering, Electrical & Electronic

Autoencoder-based Unsupervised Domain Adaptation for Speech Emotion Recognition

Jun Deng et al.

IEEE SIGNAL PROCESSING LETTERS (2014)

Article Computer Science, Information Systems

Learning Salient Features for Speech Emotion Recognition Using Convolutional Neural Networks

Qirong Mao et al.

IEEE TRANSACTIONS ON MULTIMEDIA (2014)

Article Engineering, Multidisciplinary

A Research of Speech Emotion Recognition Based on Deep Belief Network and SVM

Chenchen Huang et al.

MATHEMATICAL PROBLEMS IN ENGINEERING (2014)

Review Computer Science, Artificial Intelligence

A review of unsupervised feature learning and deep learning for time-series modeling

Martin Langkvist et al.

PATTERN RECOGNITION LETTERS (2014)

Article Engineering, Electrical & Electronic

A comparative analysis of classifiers in emotion recognition through acoustic features

Swarna Kuchibhotla et al.

INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY (2014)

Proceedings Paper Computer Science, Interdisciplinary Applications

Incomplete Big Data Clustering Algorithm Using Feature Selection and Partial Distance

Fanyu Bu et al.

2014 5TH INTERNATIONAL CONFERENCE ON DIGITAL HOME (ICDH) (2014)

Article Engineering, Electrical & Electronic

Privacy-Preserving Speech Processing

Manas A. Pathak et al.

IEEE SIGNAL PROCESSING MAGAZINE (2013)

Review Computer Science, Artificial Intelligence

Representation Learning: A Review and New Perspectives

Yoshua Bengio et al.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2013)

Review Robotics

Reinforcement learning in robotics: A survey

Jens Kober et al.

INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH (2013)

Proceedings Paper Computer Science, Artificial Intelligence

Sparse Autoencoder-based Feature Transfer Learning for Speech Emotion Recognition

Jun Deng et al.

2013 HUMAINE ASSOCIATION CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII) (2013)

Proceedings Paper Computer Science, Artificial Intelligence

Hybrid Deep Neural Network - Hidden Markov Model (DNN-HMM) Based Speech Emotion Recognition

Longfei Li et al.

2013 HUMAINE ASSOCIATION CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII) (2013)

Article Computer Science, Artificial Intelligence

The SEMAINE Database: Annotated Multimodal Records of Emotionally Colored Conversations between a Person and a Limited Agent

Gary McKeown et al.

IEEE TRANSACTIONS ON AFFECTIVE COMPUTING (2012)

Article Computer Science, Artificial Intelligence

The Effect of Model Misspecification on Semi-Supervised Classification

Ting Yang et al.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2011)

Article Computer Science, Artificial Intelligence

Survey on speech emotion recognition: Features, classification schemes, and databases

Moataz El Ayadi et al.

PATTERN RECOGNITION (2011)

Review Computer Science, Artificial Intelligence

A Survey of Affect Recognition Methods: Audio, Visual, and Spontaneous Expressions

Zhihong Zeng et al.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2009)

Article Computer Science, Artificial Intelligence

Learning Deep Architectures for AI

Yoshua Bengio

FOUNDATIONS AND TRENDS IN MACHINE LEARNING (2009)

Article Computer Science, Interdisciplinary Applications

IEMOCAP: interactive emotional dyadic motion capture database

Carlos Busso et al.

LANGUAGE RESOURCES AND EVALUATION (2008)

Article Multidisciplinary Sciences

Reducing the dimensionality of data with neural networks

G. E. Hinton et al.

SCIENCE (2006)

Article Computer Science, Artificial Intelligence

A fast learning algorithm for deep belief nets

Geoffrey E. Hinton et al.

NEURAL COMPUTATION (2006)

Article Acoustics

Toward detecting emotions in spoken dialogs

CM Lee et al.

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING (2005)

Article Computer Science, Artificial Intelligence

Canonical correlation analysis: An overview with application to learning methods

DR Hardoon et al.

NEURAL COMPUTATION (2004)

Article Multidisciplinary Sciences

Nonlinear dimensionality reduction by locally linear embedding

ST Roweis et al.

SCIENCE (2000)

Article Multidisciplinary Sciences

A global geometric framework for nonlinear dimensionality reduction

JB Tenenbaum et al.

SCIENCE (2000)

Article Computer Science, Artificial Intelligence

Generalized discriminant analysis using a kernel approach

G Baudat et al.

NEURAL COMPUTATION (2000)

Article Computer Science, Artificial Intelligence

Independent component analysis: algorithms and applications

A Hyvärinen et al.

NEURAL NETWORKS (2000)