Article

On the issues of intra-speaker variability and realism in speech, speaker, and language recognition tasks

Journal

SPEECH COMMUNICATION
Volume 101, Issue -, Pages 94-108

Publisher

ELSEVIER SCIENCE BV
DOI: 10.1016/j.specom.2018.05.004

Keywords

Realism; Speech recognition; Human-computer interaction; Computational paralinguistics

Funding

  1. AFRL [FA8750-15-1-0205]
  2. University of Texas at Dallas from the Distinguished University Chair in Telecommunications Engineering

Recent years have witnessed notable advancements in the areas of speech, speaker, and language/dialect recognition. However, many of the emerging scientific principles appear to be drifting to the sidelines with the assumption that access to larger amounts of data is all that is required to address a growing range of issues relating to new scenarios. This study surveys several challenging domains in formulating effective solutions for realistic speech data, and in particular the notion of using naturalistic data to better reflect the potential effectiveness of new algorithms. Our main focus is on intra-speaker mismatch and speech variability issues due to (i) differences in noisy speech with and without Lombard effect and a communication factor, (ii) realistic field data in noisy and increased cognitive load conditions, (iii) speech variability introduced by whispered speech, and (iv) dialect identification using found data. Finally, we study speaker-environment and speaker-speaker interactions in a newly established, fully naturalistic Prof-Life-Log corpus. The specific outcomes from this study include an analysis of the strengths and weaknesses of simulated vs. actual speech data collection for research.
