Proceedings Paper

Disentangling the Impacts of Language and Channel Variability on Speech Separation Networks

Journal

INTERSPEECH 2022
Volume -, Issue -, Pages 5343-5347

Publisher

ISCA-INT SPEECH COMMUNICATION ASSOC
DOI: 10.21437/Interspeech.2022-509

Keywords

speech separation; domain mismatch

Because of the excellent performance of speech separation in cases of complete speaker overlap, the focus of research has shifted towards dealing with more realistic scenarios. However, domain mismatch between training and testing situations remains a significant problem due to various factors. This study investigates the impacts of language and channel mismatches on speech separation and proposes a new solution for channel mismatch using projection evaluation.
Because speech separation already performs very well on speech in which two speakers completely overlap, research attention has shifted to more realistic scenarios. However, domain mismatch between training and test conditions, caused by factors such as speaker, content, channel, and environment, remains a severe problem for speech separation. Speaker and environment mismatches have been studied in the existing literature, but there are few studies on speech content and channel mismatches, and in those studies the impacts of language and channel are mostly entangled. In this study, we create several datasets that allow the two factors to be examined separately. The results show that the impact of different languages is small enough to be ignored compared with the impact of different channels. In our experiments, training on data recorded by Android phones leads to the best generalizability. Moreover, we provide a new solution for channel mismatch by evaluating projection, where the channel similarity can be measured and used to effectively select additional training data to improve performance on in-the-wild test data.
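The abstract does not say how the projection is computed or which similarity measure is used. Purely as an illustration of the general idea of similarity-based data selection, the sketch below assumes that each recording channel is summarised by a hypothetical fixed-length feature vector and that cosine similarity is the measure. The function names, channel names, and vectors are invented for this example and are not taken from the paper.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D feature vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def rank_channels_by_similarity(target, candidates):
    """Return candidate channel names, most similar to `target` first.

    `target` is the (hypothetical) feature vector of the test channel;
    `candidates` maps a channel name to its feature vector.
    """
    scores = {name: cosine_similarity(target, vec)
              for name, vec in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Made-up vectors: NOT data or results from the paper.
target = [1.0, 0.0, 0.0]
candidates = {
    "channel_a": [0.9, 0.1, 0.0],
    "channel_b": [0.1, 0.9, 0.0],
    "channel_c": [0.5, 0.5, 0.0],
}

ranking = rank_channels_by_similarity(target, candidates)
print(ranking)
```

Under these assumptions, additional training data would be drawn from the channels at the top of the ranking. Whether the paper ranks channels in this way is not stated in the abstract.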
