☆ 4.6 Article

X-Transformer: A Machine Translation Model Enhanced by the Self-Attention Mechanism

APPLIED SCIENCES-BASEL (2022)

期刊

APPLIED SCIENCES-BASEL

卷 12, 期 9, 页码 -

出版社

MDPI

DOI: 10.3390/app12094502

关键词

machine translation; natural language processing

类别

Chemistry, Multidisciplinary Engineering, Multidisciplinary Materials Science, Multidisciplinary Physics, Applied

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

智能总结 New
摘要

This study presents a new machine translation model called X-Transformer, which improves the original Transformer model in terms of parameter compression, encoder structure modification, and decoder model size reduction. The X-Transformer achieves better translation results and significantly reduces training time compared to the Transformer model.

Machine translation has received significant attention in the field of natural language processing not only because of its challenges but also due to the translation needs that arise in the daily life of modern people. In this study, we design a new machine translation model named X-Transformer, which refines the original Transformer model regarding three aspects. First, the model parameter of the encoder is compressed. Second, the encoder structure is modified by adopting two layers of the self-attention mechanism consecutively and reducing the point-wise feed forward layer to help the model understand the semantic structure of sentences precisely. Third, we streamline the decoder model size, while maintaining the accuracy. Through experiments, we demonstrate that having a large number of decoder layers not only affects the performance of the translation model but also increases the inference time. The X-Transformer reaches the state-of-the-art result of 46.63 and 55.63 points in the BiLingual Evaluation Understudy (BLEU) metric of the World Machine Translation (WMT), from 2014, using the English-German and English-French translation corpora, thus outperforming the Transformer model with 19 and 18 BLEU points, respectively. The X-Transformer significantly reduces the training time to only 1/3 times that of the Transformer. In addition, the heat maps of the X-Transformer reach token-level precision (i.e., token-to-token attention), while the Transformer model remains at the sentence level (i.e., token-to-sentence attention).

X-Transformer: A Machine Translation Model Enhanced by the Self-Attention Mechanism

期刊

APPLIED SCIENCES-BASEL

出版社

MDPI

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

X-Transformer: A Machine Translation Model Enhanced by the Self-Attention Mechanism

期刊

APPLIED SCIENCES-BASEL

出版社

MDPI

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文