Article

Neural Machine Translation with Deep Attention

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TPAMI.2018.2876404

Keywords

Decoding; Task analysis; Semantics; NIST; Encoding; Neural networks; Analytical models; Deep attention network; neural machine translation (NMT); attention-based sequence-to-sequence learning; natural language processing

Funding

  1. National Natural Science Foundation of China [61672440, 61622209]
  2. Fundamental Research Funds for the Central Universities [ZK1024]
  3. Scientific Research Project of National Language Committee of China [YB135-49]
  4. Baidu Scholarship

Abstract

Deepening neural models has proven very successful in improving model capacity on complex learning tasks such as machine translation. Previous efforts on deep neural machine translation focus mainly on the encoder and the decoder, with little work on the attention mechanism. However, the attention mechanism is vital for inducing the translation correspondence between different languages, and shallow neural networks are relatively insufficient for this, especially when the encoder and decoder are deep. In this paper, we propose a deep attention model (DeepAtt). Based on low-level attention information, DeepAtt automatically determines what should be passed or suppressed from the corresponding encoder layer so that the distributed representation is appropriate for high-level attention and translation. We conduct experiments on NIST Chinese-English, WMT English-German, and WMT English-French translation tasks, where, with five attention layers, DeepAtt yields performance very competitive with state-of-the-art results. We empirically find that with an adequate increase in attention layers, DeepAtt tends to produce more accurate attention weights. An in-depth analysis of the translation of important context words further reveals that DeepAtt significantly improves the faithfulness of system translations.
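
The idea of letting low-level attention decide what is passed or suppressed from the next encoder layer can be illustrated with a small sketch. The code below is not the DeepAtt architecture from the paper, which this page does not specify; it is a simplified, parameter-free illustration under stated assumptions. The function names (`attend`, `deep_attention`), the dot-product scoring, the sigmoid gate computed from the lower-level context, and the additive query update are all hypothetical choices made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attend(query, states):
    """Dot-product attention: return (weights, context) over `states`."""
    weights = softmax([dot(query, h) for h in states])
    dim = len(states[0])
    context = [sum(w * h[i] for w, h in zip(weights, states)) for i in range(dim)]
    return weights, context


def deep_attention(query, encoder_layers):
    """Stack one attention step per encoder layer (hypothetical simplification).

    The context produced by the lower attention layer gates the next encoder
    layer's states elementwise, so low-level attention information controls
    what is passed to, or suppressed from, the higher-level attention.
    """
    weights, context = attend(query, encoder_layers[0])
    for states in encoder_layers[1:]:
        # Assumption: a parameter-free sigmoid gate; a trained model would
        # learn this gate from data.
        gate = [sigmoid(c) for c in context]
        gated = [[g * x for g, x in zip(gate, h)] for h in states]
        # Assumption: the higher-level query adds the lower-level context.
        higher_query = [q + c for q, c in zip(query, context)]
        weights, context = attend(higher_query, gated)
    return weights, context


# Two encoder layers, three source positions, hidden size 2 (toy values).
encoder_layers = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, -0.5], [1.0, 2.0], [0.0, 1.0]],
]
query = [1.0, 0.5]
weights, context = deep_attention(query, encoder_layers)
```

Here `weights` is a distribution over the three source positions at the top attention layer and `context` is the corresponding weighted sum of gated encoder states. With a single encoder layer the function reduces to ordinary one-layer attention.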
