Article

Sequential Transformer via an Outside-In Attention for image captioning

Journal

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.engappai.2021.104574

Keywords

Image captioning; Self-attention; Recurrent network; Transformer

Funding

  1. Scientific research startup project of Karamay campus of China University of Petroleum (Beijing), China [XQZX20200021]
  2. National Natural Science Foundation of China [61673396]
  3. Fundamental Research Funds for the Central Universities, China [18CX02136A]

Abstract

This study introduces an Outside-in Attention mechanism to address the limitations of recurrent attention and self-attention in image captioning. By combining the advantages of the transformer and the recurrent network, it achieves competitive results.

Attention-based approaches have firmly established the state of the art in image captioning. However, both the recurrent attention in recurrent neural networks (RNNs) and the self-attention in transformers have limitations. Recurrent attention relies only on the external state to decide where to look, failing to model the internal relationships between image regions; self-attention is just the opposite. To fill this gap, we first introduce an Outside-in Attention that lets the external state participate in the interaction of the image regions, prompting the model to learn the dependencies inside the image regions as well as those between the image regions and the external state. We then investigate a Sequential Transformer framework (S-Transformer) based on the original Transformer structure, whose decoder incorporates the Outside-in Attention and an RNN. This framework helps the model inherit the advantages of both the transformer and the recurrent network in sequence modeling. Tested on the MSCOCO dataset, the proposed approaches achieve competitive results in single-model and ensemble configurations on both the Karpathy test split and the online test server.
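The abstract's core idea, an external recurrent state joining the region-region interaction, can be sketched in a few lines. Below is a minimal PyTorch illustration, not the paper's exact formulation: the class name OutsideInAttention, the tensor shapes, and the choice to append the recurrent state as one extra key/value token are all assumptions made for this sketch.

    # A minimal sketch (not the paper's exact formulation) of the Outside-in
    # Attention idea: the external recurrent state is appended to the image-
    # region features so attention captures both region-region dependencies
    # and region-state dependencies. All names and shapes are assumptions.
    import torch
    import torch.nn as nn

    class OutsideInAttention(nn.Module):
        def __init__(self, d_model: int, num_heads: int = 8):
            super().__init__()
            self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)

        def forward(self, regions, external_state):
            # regions: (batch, num_regions, d_model) image-region features
            # external_state: (batch, d_model), e.g. an RNN decoder hidden state
            state = external_state.unsqueeze(1)          # (batch, 1, d_model)
            mixed = torch.cat([regions, state], dim=1)   # regions + state token
            # Regions attend over the other regions and the external state,
            # learning internal (region-region) and external (region-state) links.
            out, _ = self.attn(query=regions, key=mixed, value=mixed)
            return out

    # Usage sketch: a recurrent decoder state guides attention over 36 regions.
    regions = torch.randn(2, 36, 512)
    h_t = torch.randn(2, 512)
    attended = OutsideInAttention(512)(regions, h_t)
    print(attended.shape)  # torch.Size([2, 36, 512])

Appending the state as a single token is just one plausible reading of "making the external state participate in the interaction of the image regions"; the published S-Transformer decoder couples this attention with an RNN, which the sketch omits.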
