Proceedings Paper

Generating Diverse Back-Translations via Constraint Random Decoding

Journal

MACHINE TRANSLATION, CCMT 2021
Volume 1464, Pages 92-104

Publisher

SPRINGER-VERLAG SINGAPORE PTE LTD
DOI: 10.1007/978-981-16-7512-6_8

Keywords

NMT; Back-translation; Automatic post-editing; Evolution decoding algorithm

Funding

  1. National Natural Science Foundation of China [62076211, U1908216, 61573294]
  2. Outstanding Achievement Late Fund of the State Language Commission of China [WT135-38]

Back-translation is an effective data augmentation method for improving the performance of Neural Machine Translation (NMT). The authors propose a constraint random decoding method combined with an evolution decoding algorithm, which generates more diverse synthetic sentences while maintaining quality.
Back-translation has been proven to be an effective data augmentation method that translates target-side monolingual data into the source language to improve the performance of Neural Machine Translation (NMT), especially in low-resource scenarios. Previous research shows that the diversity of the synthetic source sentences is essential for back-translation. However, the frequently used random methods such as sampling or noised beam search, although they can output diverse back-translations, often generate noisy synthetic sentences. To alleviate this problem, we propose a simple but effective constraint random decoding method for back-translation. The proposed method is based on an automatic post-editing (APE) data augmentation framework, which incorporates fluency boost learning. Moreover, to increase the diversity of the synthetic data while ensuring quality, we propose to use an evolution decoding algorithm. Compared with the original back-translation, our method generates synthetic sentences that are more diverse yet less noisy. The experimental results show that the proposed method achieves an improvement of 0.6 BLEU on the WMT18 EN-DE news dataset and of more than 0.4 BLEU on an EN-ZH dataset in the medical domain.
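
The trade-off the abstract describes, where random decoding yields diverse but noisy output, can be illustrated with a single toy next-token step. The sketch below is not the paper's constraint random decoding or its evolution decoding algorithm, neither of which is specified here. It uses top-k truncation as a stand-in constraint, and the function names, the toy logits, and the choice of k are all assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_unrestricted(logits, rng):
    """Sample a token index from the full distribution (diverse, but can pick unlikely tokens)."""
    probs = softmax(logits)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


def sample_top_k(logits, k, rng):
    """Sample only among the k highest-scoring tokens (a stand-in constraint, assumed here)."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in top])  # renormalise over the kept tokens
    return rng.choices(top, weights=probs, k=1)[0]


# Hypothetical next-token scores for a five-word vocabulary.
toy_logits = [2.0, 1.5, 0.3, -1.0, -3.0]
rng = random.Random(0)

unrestricted_draws = [sample_unrestricted(toy_logits, rng) for _ in range(1000)]
constrained_draws = [sample_top_k(toy_logits, 2, rng) for _ in range(1000)]

print("distinct tokens, unrestricted:", sorted(set(unrestricted_draws)))
print("distinct tokens, top-2 only:  ", sorted(set(constrained_draws)))
```

With the constraint in place, sampling stays random but can never select the low-scoring tail, which is the general idea behind trading a little diversity for less noise.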
