Journal
APPLIED SCIENCES-BASEL
Volume 11, Issue 18, Pages -
Publisher
MDPI
DOI: 10.3390/app11188737
Keywords
natural language processing; transfer learning; neural machine translation
Funding
- National Research Foundation of Korea(NRF) - Korea government(MSIT) [2018R1A5A7059549, 2020R1A2C1014037]
- Institute of Information & communications Technology Planning & Evaluation (IITP) - Korea government (MSIT) (Artificial Intelligence Graduate School Program (Hanyang University)) [2020-0-01373]
- National Research Foundation of Korea [2020R1A2C1014037] Funding Source: Korea Institute of Science & Technology Information (KISTI), National Science & Technology Information Service (NTIS)
This work uses sequence-to-sequence (seq2seq) models pre-trained on monolingual corpora for machine translation. We pre-train two seq2seq models on monolingual corpora, one for the source language and one for the target language, and then combine the encoder of the source-language model with the decoder of the target-language model, i.e., the cross-connection. Because the two modules are pre-trained entirely independently, we add an intermediate layer between the pre-trained encoder and decoder to help them map to each other. These monolingual pre-trained models can serve as a multilingual pre-trained model, because one model can be cross-connected with another model pre-trained on any other language, while their capacity is unaffected by the number of languages. We demonstrate that our method significantly improves translation performance over the random baseline. Moreover, we analyze the appropriate choice of the intermediate layer, the importance of each part of a pre-trained model, and how performance changes with the size of the bitext.
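The cross-connection described above can be sketched as a simple pipeline: an encoder taken from the source-language model, a trainable intermediate layer, and a decoder taken from the target-language model. The following is a minimal illustrative sketch, not the paper's implementation; the toy hidden size, random placeholder weights, and function names are all assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 4  # toy hidden size (assumption, for illustration only)

# Stand-ins for modules taken from two independently pre-trained
# monolingual seq2seq models (random placeholder weights, not real
# pre-trained parameters).
W_enc = rng.standard_normal((d_model, d_model))  # source-language encoder
W_dec = rng.standard_normal((d_model, d_model))  # target-language decoder
W_mid = np.eye(d_model)                          # intermediate (bridging) layer

def source_encoder(x):
    """Encoder taken from the source-language pre-trained model."""
    return np.tanh(x @ W_enc)

def intermediate(h):
    """Layer inserted between the modules to help them map to each other."""
    return h @ W_mid

def target_decoder(h):
    """Decoder taken from the target-language pre-trained model."""
    return np.tanh(h @ W_dec)

def cross_connected_translate(x):
    # encoder (source model) -> intermediate layer -> decoder (target model)
    return target_decoder(intermediate(source_encoder(x)))

x = rng.standard_normal((3, d_model))  # 3 toy source-token representations
y = cross_connected_translate(x)
print(y.shape)  # (3, 4)
```

Because the intermediate layer is the only glue between the two modules, any source-language encoder can in principle be paired with any target-language decoder, which is what makes the collection of monolingual models behave like a multilingual pre-trained model.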