Journal
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE
Volume 4, Issue 4, Pages 480-489
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TETCI.2020.2972926
Keywords
Domain adversarial training; generalization; class alignment; speech emotion recognition
Funding
- National Science Foundation of China [61772188]
- National Key R&D Program of China [2018YFC0831800]
Although recent research on speech emotion recognition has demonstrated that learning domain-invariant features provides an elegant solution to domain mismatch, the features learned by existing methods lack the generalization capability to capture latent information from datasets. We propose two novel domain adaptation methods, the generalized domain adversarial neural network (GDANN) and the class-aligned GDANN (CGDANN), to learn generalized domain-invariant representations for emotion recognition. GDANN and CGDANN, which are derived from multitask learning (MTL), consist of three tasks. The main task is to recognize the emotional category to which the input belongs. The remaining two tasks are auxiliary. One uses a variational autoencoder to model the input distribution, which encourages the model to learn the distribution of latent representations. The other learns representations common to the different domains, which the domain classifier finds difficult to distinguish. Through a gradient reversal layer, the gradient of the domain classifier guides the shared representations of the source and target domains to approximate each other. To evaluate the effectiveness of the proposed methods, we conduct several experiments with the IEMOCAP and MSP-IMPROV datasets. The results show that the proposed methods perform well compared with state-of-the-art methods. Notably, CGDANN utilizes a small quantity of labeled target domain samples to align the distribution representation and obtains the best performance among the compared methods. We further visualize the representations learned by the proposed methods and find that the representations of the source and target domains converge with a low variance.
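The gradient reversal layer mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the logistic domain classifier, the two-dimensional feature vector, the weights, and the scaling factor `lam` are all assumptions chosen for illustration. The layer acts as the identity in the forward pass and multiplies the gradient by `-lam` in the backward pass, so the domain classifier still learns to separate domains while the feature extractor is pushed toward features that confuse it.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def grl_forward(features):
    # Forward pass of the gradient reversal layer: identity.
    return list(features)


def grl_backward(upstream_grad, lam):
    # Backward pass: flip the sign of the gradient and scale it by lam
    # (lam is a hypothetical trade-off hyperparameter).
    return [-lam * g for g in upstream_grad]


def domain_loss_and_grads(features, domain_label, weights):
    # Binary cross-entropy of a logistic domain classifier (an assumed,
    # deliberately tiny classifier), with gradients w.r.t. its weights
    # and w.r.t. the incoming features.
    z = sum(w * f for w, f in zip(weights, features))
    p = sigmoid(z)
    loss = -(domain_label * math.log(p) + (1 - domain_label) * math.log(1 - p))
    dz = p - domain_label
    grad_w = [dz * f for f in features]
    grad_f = [dz * w for w in weights]
    return loss, grad_w, grad_f


# Illustrative values only; label 1 stands for "source domain".
features = [0.5, -1.0]
weights = [0.8, 0.3]

loss, grad_w, grad_f = domain_loss_and_grads(grl_forward(features), 1, weights)

# The domain classifier is updated with grad_w as usual (it minimizes the loss).
# The feature extractor receives the reversed gradient instead of grad_f.
grad_to_extractor = grl_backward(grad_f, lam=0.5)
```

In an automatic-differentiation framework the same idea is typically expressed as a custom operation whose backward rule negates the incoming gradient; the manual version above only makes the sign flip explicit.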