Article

Improving data augmentation for low resource speech-to-text translation with diverse paraphrasing

Journal

NEURAL NETWORKS
Volume 148, Pages 194-205

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.neunet.2022.01.016

Keywords

Data augmentation; Speech translation; Paraphrasing

Funding

  1. National Natural Science Foundation of China [61906158]


This study proposes a target-side data augmentation method for low-resource speech translation, which generates large-scale target-side paraphrases to improve translation quality. Experimental results show significant and consistent improvements on the PPDB datasets across several languages.
A high-quality end-to-end speech translation model relies on large-scale speech-to-text training data, which is usually scarce or even unavailable for some low-resource language pairs. To overcome this, we propose a target-side data augmentation method for low-resource speech translation. In particular, we first generate large-scale target-side paraphrases with a paraphrase generation model that incorporates several statistical machine translation (SMT) features and the commonly used recurrent neural network (RNN) feature. Then, a filtering model based on semantic similarity and speech-word pair co-occurrence is proposed to select the highest-scoring source speech-target paraphrase pairs from the candidates. Experimental results on English, Arabic, German, Latvian, Estonian, Slovenian and Swedish paraphrase generation show that the proposed method achieves significant and consistent improvements over several strong baseline models on the PPDB datasets (http://paraphrase.org/). To introduce the paraphrase generation results into low-resource speech translation, we propose two strategies: audio-text pair recombination and multiple-references training. Experimental results show that speech translation models trained on new audio-text datasets that incorporate the paraphrase generation results yield substantial improvements over the baselines, especially on low-resource languages. (C) 2022 Elsevier Ltd. All rights reserved.
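The abstract describes a two-stage pipeline: score paraphrase candidates with a filtering model (semantic similarity plus speech-word pair co-occurrence), then recombine the selected paraphrases with the original audio to enlarge the training set. The sketch below illustrates that flow under stated assumptions; the function names, the linear interpolation weight `alpha`, and the scoring callables are all hypothetical, since the abstract does not specify the paper's exact scoring features or combination scheme.

```python
def select_paraphrase(candidates, semantic_sim, cooccurrence, alpha=0.5):
    """Pick the candidate maximizing a combined filter score.

    candidates: paraphrase strings for one target sentence.
    semantic_sim / cooccurrence: hypothetical scoring functions
    returning floats (the paper's actual features are not given
    in the abstract).
    alpha: interpolation weight (assumed, for illustration only).
    """
    def score(p):
        return alpha * semantic_sim(p) + (1 - alpha) * cooccurrence(p)
    return max(candidates, key=score)


def recombine(dataset, paraphrase_table):
    """Audio-text pair recombination: for each (audio, text) pair,
    add one new pair per selected paraphrase of the text, keeping
    the original pairs as well."""
    augmented = list(dataset)
    for audio, text in dataset:
        for para in paraphrase_table.get(text, []):
            augmented.append((audio, para))
    return augmented
```

For example, with one utterance and one selected paraphrase, `recombine` doubles the number of training pairs while reusing the same audio, which is the core idea of target-side augmentation.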
