
Neural Machine Translation with Deep Attention

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2020)

Journal

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE

Volume 42, Issue 1, Pages 154-163

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/TPAMI.2018.2876404

Keywords

Decoding; Task analysis; Semantics; NIST; Encoding; Neural networks; Analytical models; Deep attention network; neural machine translation (NMT); attention-based sequence-to-sequence learning; natural language processing

Funding

National Natural Science Foundation of China [61672440, 61622209]
Fundamental Research Funds for the Central Universities [ZK1024]
Scientific Research Project of National Language Committee of China [YB135-49]
Baidu Scholarship

Abstract

Deepening neural models has proven very successful in improving model capacity on complex learning tasks such as machine translation. Previous work on deep neural machine translation has focused mainly on the encoder and the decoder, with little effort devoted to the attention mechanism. However, the attention mechanism is vital for inducing the translation correspondence between languages, a task for which shallow neural networks are relatively insufficient, especially when the encoder and decoder are deep. In this paper, we propose a deep attention model (DeepAtt). Based on low-level attention information, DeepAtt automatically determines what should be passed or suppressed from the corresponding encoder layer, so that the distributed representation is appropriate for high-level attention and translation. We conduct experiments on NIST Chinese-English, WMT English-German, and WMT English-French translation tasks, where, with five attention layers, DeepAtt yields performance that is very competitive with state-of-the-art results. We empirically find that, with an adequate increase in attention layers, DeepAtt tends to produce more accurate attention weights. An in-depth analysis of the translation of important context words further reveals that DeepAtt significantly improves the faithfulness of system translations.
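
The abstract does not give DeepAtt's equations, so the following is only a toy sketch of the general idea it describes: attention is applied layer by layer, and the context from the lower attention level gates what passes from the next encoder layer. Every name here (`attend`, `deep_attention`), the dot-product scoring, and the parameter-free sigmoid gate are illustrative assumptions, not the authors' formulation, which would use learned parameters.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, states):
    # Dot-product attention (an assumption): weights over source positions
    # and the weighted sum of the states as the context vector.
    weights = softmax([dot(query, h) for h in states])
    dim = len(states[0])
    context = [sum(w * h[i] for w, h in zip(weights, states)) for i in range(dim)]
    return weights, context


def deep_attention(query, encoder_layers):
    # encoder_layers: one list of state vectors per encoder layer, lowest first.
    weights, context = attend(query, encoder_layers[0])
    all_weights = [weights]
    for states in encoder_layers[1:]:
        # Hypothetical gate: the lower-level context decides, per dimension,
        # how much of each encoder state is passed or suppressed.
        gated = [[sigmoid(c + x) * x for c, x in zip(context, h)] for h in states]
        weights, context = attend(query, gated)
        all_weights.append(weights)
    return all_weights, context


# Two toy encoder layers, three source positions, two dimensions each.
layers = [
    [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]],
]
all_weights, context = deep_attention([1.0, 0.0], layers)
```

Each entry of `all_weights` is one attention distribution over the source positions, so comparing entries across levels is a simple way to see how the weights change as attention layers are added.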
