4.6 Article

Neural Network Adaptation and Data Augmentation for Multi-Speaker Direction-of-Arrival Estimation

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TASLP.2021.3060257

Keywords

Data models; Adaptation models; Direction-of-arrival estimation; Neural networks; Location awareness; Data collection; Robots; DOA estimation; data augmentation; sound source localization; weakly-supervised learning

Funding

  1. European Union under the EU Horizon 2020 Research and Innovation Action MuMMER Project (MultiModal Mall Entertainment Robot) [688147]

This paper proposes a novel approach for multi-speaker direction-of-arrival (DOA) estimation using data augmentation and weakly-supervised domain adaptation. Source domain data are generated by simulation, and real data are collected with weak labels only. With this setup, the proposed method achieves performance comparable to that of models trained on fully labeled real data. The approach suggests an effective development procedure for DOA estimation models applied to new types of microphone arrays with minimal data collection effort.
Deep neural networks have been successfully applied to sound direction-of-arrival (DOA) estimation under challenging conditions. However, such a learning-based approach requires a large amount of labeled training data, which is difficult to acquire. To address this problem, we propose a novel approach for multi-speaker DOA estimation with data augmentation and weakly-supervised domain adaptation. We generate source domain data by simulation, and collect real data annotated only with the number of sound sources as weak labels. The real data are further augmented by mixing single-source segments. Then, weakly-supervised domain adaptation is applied to models pre-trained on the simulated data. We define a loss function for the adaptation process that exploits the weak labels and the mixture component information in the augmented data. Experiments with real robot audio data show that the proposed approach achieves performance comparable to that obtained when fully labeled real data are used. This paper suggests an effective development procedure for DOA estimation models applied to new types of microphone arrays with minimal data collection effort.
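
The augmentation step described in the abstract, mixing single-source segments to obtain multi-source examples, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `mix_single_source_segments`, the `(channels, samples)` array layout, and the plain unweighted sum are assumptions made for illustration. It also assumes the chosen segments come from distinct sources, so that the number of mixed segments can serve as the weak label.

```python
import numpy as np


def mix_single_source_segments(segments, rng, n_mix=2):
    """Mix n_mix single-source segments into one multi-source example.

    segments: list of arrays shaped (channels, samples), each assumed to
              contain exactly one active sound source (hypothetical layout).
    Returns (mixture, weak_label, component_indices), where weak_label is
    the number of sources and component_indices records which segments
    were mixed (the "mixture component information").
    """
    # Pick distinct segments so that each one contributes one source.
    idx = rng.choice(len(segments), size=n_mix, replace=False)
    # Truncate to the shortest chosen segment so the arrays align.
    length = min(segments[i].shape[1] for i in idx)
    # Sum the multi-channel waveforms sample by sample.
    mixture = np.stack([segments[i][:, :length] for i in idx]).sum(axis=0)
    return mixture, n_mix, idx.tolist()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three synthetic 4-channel "recordings" of different lengths.
    segments = [rng.standard_normal((4, n)) for n in (1600, 2000, 1800)]
    mixture, weak_label, components = mix_single_source_segments(segments, rng)
    print(mixture.shape, weak_label, components)
```

Keeping the component indices alongside the source count mirrors the abstract's statement that the adaptation loss uses both the weak labels and the mixture component information. How the loss combines them is not specified in this listing and is not sketched here.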
