☆ 4.7 Article

Contrastive Adversarial Domain Adaptation Networks for Speaker Recognition

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2022)

期刊

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS

卷 33, 期 5, 页码 2236-2245

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TNNLS.2020.3044215

关键词

Feature extraction; Training; Task analysis; Speaker recognition; Data models; Adaptation models; Data mining; Adversarial learning; domain adaptation; domain adversarial networks (DANs); domain invariance; speaker recognition

类别

Computer Science, Artificial Intelligence Computer Science, Hardware & Architecture Computer Science, Theory & Methods Engineering, Electrical & Electronic

资金

Research Grant Council (RGC) of Hong Kong SAR [PolyU152113/17E]
Taiwan Ministry of Science and Technology (MOST) [109-2634-F-009-024]

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

智能总结 New
摘要

In this article, a contrastive adversarial domain adaptation network (CADAN) is proposed to address the difficulty of finding domain-invariant space in a single feature extractor. By splitting the feature extractor into two branches, CADAN achieves class-dependence and domain-invariance simultaneously. Evaluation experiments demonstrate that CADAN outperforms conventional methods in terms of speaker identification accuracy.

Domain adaptation aims to reduce the mismatch between the source and target domains. A domain adversarial network (DAN) has been recently proposed to incorporate adversarial learning into deep neural networks to create a domain-invariant space. However, DAN's major drawback is that it is difficult to find the domain-invariant space by using a single feature extractor. In this article, we propose to split the feature extractor into two contrastive branches, with one branch delegating for the class-dependence in the latent space and another branch focusing on domain-invariance. The feature extractor achieves these contrastive goals by sharing the first and last hidden layers but possessing decoupled branches in the middle hidden layers. For encouraging the feature extractor to produce class-discriminative embedded features, the label predictor is adversarially trained to produce equal posterior probabilities across all of the outputs instead of producing one-hot outputs. We refer to the resulting domain adaptation network as contrastive adversarial domain adaptation network (CADAN). We evaluated the embedded features' domain-invariance via a series of speaker identification experiments under both clean and noisy conditions. Results demonstrate that the embedded features produced by CADAN lead to a 33% improvement in speaker identification accuracy compared with the conventional DAN.

Contrastive Adversarial Domain Adaptation Networks for Speaker Recognition

期刊

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

Contrastive Adversarial Domain Adaptation Networks for Speaker Recognition

期刊

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文