Article

Reusing Monolingual Pre-Trained Models by Cross-Connecting Seq2seq Models for Machine Translation

Journal

APPLIED SCIENCES-BASEL
Volume 11, Issue 18, Pages: -

Publisher

MDPI
DOI: 10.3390/app11188737

Keywords

natural language processing; transfer learning; neural machine translation

Funding

  1. National Research Foundation of Korea (NRF) - Korea government (MSIT) [2018R1A5A7059549, 2020R1A2C1014037]
  2. Institute of Information & communications Technology Planning & Evaluation (IITP) - Korea government (MSIT) (Artificial Intelligence Graduate School Program (Hanyang University)) [2020-0-01373]
  3. National Research Foundation of Korea [2020R1A2C1014037] - Funding Source: Korea Institute of Science & Technology Information (KISTI), National Science & Technology Information Service (NTIS)

Abstract

This work uses sequence-to-sequence (seq2seq) models pre-trained on monolingual corpora for machine translation. We pre-train two seq2seq models on monolingual corpora for the source and target languages, then combine the encoder of the source-language model with the decoder of the target-language model, i.e., the cross-connection. Because the two modules are pre-trained completely independently, we add an intermediate layer between the pre-trained encoder and decoder to help them map to each other. These monolingual pre-trained models can serve as a multilingual pre-trained model, because one model can be cross-connected with a model pre-trained on any other language, while its capacity is not affected by the number of languages. We demonstrate that our method significantly improves translation performance over the random baseline. Moreover, we analyze the appropriate choice of the intermediate layer, the importance of each part of a pre-trained model, and how performance changes with the size of the bitext.
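The cross-connection described in the abstract can be sketched in code. The following PyTorch sketch is illustrative only and is not the authors' implementation: the class name CrossConnectedSeq2Seq, the layer sizes, and the choice of a single Transformer encoder layer as the intermediate layer are assumptions, and randomly initialized modules stand in for the two monolingual pre-trained seq2seq models. Positional encodings and attention masks are omitted for brevity.

import torch
import torch.nn as nn

D_MODEL, N_HEAD, VOCAB = 512, 8, 32000   # illustrative sizes, not taken from the paper

class CrossConnectedSeq2Seq(nn.Module):
    """Joins a source-language encoder and a target-language decoder (hypothetical sketch)."""

    def __init__(self, src_embed, src_encoder, tgt_embed, tgt_decoder):
        super().__init__()
        self.src_embed = src_embed        # embeddings of the source-language pre-trained model
        self.encoder = src_encoder        # encoder of the source-language pre-trained model
        self.tgt_embed = tgt_embed        # embeddings of the target-language pre-trained model
        self.decoder = tgt_decoder        # decoder of the target-language pre-trained model
        # Intermediate layer bridging the two independently pre-trained representation spaces;
        # a single Transformer encoder layer is only one possible choice (an assumption here).
        self.intermediate = nn.TransformerEncoderLayer(D_MODEL, N_HEAD, batch_first=True)
        self.generator = nn.Linear(D_MODEL, VOCAB)

    def forward(self, src_ids, tgt_ids):
        memory = self.encoder(self.src_embed(src_ids))        # source-side hidden states
        memory = self.intermediate(memory)                    # map toward the decoder's space
        out = self.decoder(self.tgt_embed(tgt_ids), memory)   # target-side decoding
        return self.generator(out)                            # logits over the target vocabulary

# Randomly initialized stand-ins for the two monolingual pre-trained models.
src_encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(D_MODEL, N_HEAD, batch_first=True), num_layers=6)
tgt_decoder = nn.TransformerDecoder(
    nn.TransformerDecoderLayer(D_MODEL, N_HEAD, batch_first=True), num_layers=6)
model = CrossConnectedSeq2Seq(nn.Embedding(VOCAB, D_MODEL), src_encoder,
                              nn.Embedding(VOCAB, D_MODEL), tgt_decoder)

logits = model(torch.randint(0, VOCAB, (2, 10)), torch.randint(0, VOCAB, (2, 7)))
print(logits.shape)   # torch.Size([2, 7, 32000])

In this sketch, swapping in a decoder (and its embeddings) pre-trained on a different target language yields a new translation direction without retraining the source-side encoder, which reflects how one monolingual model can be cross-connected with models for other languages.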

