3.8 Proceedings Paper

EMOTIONAL SPEECH SYNTHESIS WITH RICH AND GRANULARIZED CONTROL

Publisher

IEEE
DOI: 10.1109/icassp40776.2020.9053732

Keywords

emotional TTS; emotion intensity control; end-to-end GST-Tacotron2

Funding

  1. Institute for Information & Communications Technology Promotion (IITP) - Korea government (MSIT) [2019-0-00447]

This paper proposes an effective emotion control method for an end-to-end text-to-speech (TTS) system. To flexibly control the distinct characteristics of a target emotion category, it is essential to determine the embedding vectors that represent the TTS input. We apply an inter-to-intra emotional distance ratio algorithm to the embedding vectors, which minimizes the distance to the target emotion category while maximizing the distance to the other emotion categories. To further enhance the expressiveness of the target speech, we also introduce an interpolation technique that allows the intensity of a target emotion to be changed gradually toward that of neutral speech. Subjective evaluation results for emotional expressiveness and controllability show that the proposed algorithm outperforms conventional methods.
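
The abstract does not give the formulas, so the following is only a rough sketch of the two ideas it names, not the authors' implementation. It assumes Euclidean distance, a ratio defined as mean distance to other-category embeddings divided by mean distance to target-category embeddings, and plain linear interpolation between a neutral and an emotional embedding. All function names and the toy 2-D vectors are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inter_to_intra_ratio(candidate, target_vectors, other_vectors):
    """Assumed ratio: mean distance to other categories (inter)
    divided by mean distance to the target category (intra)."""
    intra = sum(euclidean(candidate, v) for v in target_vectors) / len(target_vectors)
    inter = sum(euclidean(candidate, v) for v in other_vectors) / len(other_vectors)
    return inter / (intra + 1e-8)  # small constant avoids division by zero


def pick_representative(target_vectors, other_vectors):
    """Choose the target-category embedding with the largest ratio,
    i.e. close to its own category and far from the others."""
    return max(
        target_vectors,
        key=lambda c: inter_to_intra_ratio(c, target_vectors, other_vectors),
    )


def interpolate(neutral, emotion, alpha):
    """Linear blend: alpha=0 gives the neutral embedding,
    alpha=1 gives the full-intensity emotion embedding."""
    return [(1 - alpha) * n + alpha * e for n, e in zip(neutral, emotion)]


# Toy 2-D embeddings (made up for illustration only).
happy = [[1.0, 0.0], [0.9, 0.1], [0.5, 0.5]]
neutral = [[0.0, 0.0], [0.1, 0.1]]
sad = [[0.0, 1.0], [0.1, 0.9]]

representative = pick_representative(happy, neutral + sad)
half_intensity = interpolate([0.0, 0.0], [1.0, 0.0], 0.5)
```

In this sketch, sweeping `alpha` from 1 down to 0 moves the conditioning vector from the selected emotion embedding toward the neutral one, which is one simple way to read the "gradually changed" intensity control described in the abstract.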
