Article

Stimulus-driven and concept-driven analysis for image caption generation

Journal

NEUROCOMPUTING
Volume 398, Pages 520-530

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2019.04.095

Keywords

Image captioning; Stimulus-driven; Concept-driven; Attention mechanism; LSTM

Funding

  1. Open Project of State Key Laboratory for Novel Software Technology, Nanjing University, P.R. China [KFKT2019B17]
  2. Fundamental Research Funds for the Central Universities of China [2722019PY052]

Image captioning has recently made great progress in computer vision and artificial intelligence. However, language models still fall short on high-level visual tasks, and generating accurate captions for a complex scene containing multiple targets remains a challenge. To address these problems, we introduce the theory of attention from psychology to image caption generation and propose two types of attention mechanisms: stimulus-driven and concept-driven. Our attention model combines a convolutional neural network (CNN) over images with a long short-term memory (LSTM) network over sentences. Experimental comparisons show that the proposed method achieves good performance on the MSCOCO test server. (C) 2019 Elsevier B.V. All rights reserved.
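The abstract describes an attention model that weights CNN image features against an LSTM decoder state. The paper's own equations are not reproduced on this page, so the following is only a minimal sketch of a generic additive (Bahdanau-style) soft-attention step of the kind such models use; the weight names `W_f`, `W_h`, `w_a` and all dimensions are illustrative assumptions, not the authors' parameters.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_step(features, hidden, W_f, W_h, w_a):
    """One soft-attention step: score each CNN region against the
    decoder hidden state, then return the attention-weighted context.

    features: (num_regions, feat_dim) CNN region feature vectors
    hidden:   (hidden_dim,) LSTM hidden state
    """
    # additive attention scores, one per image region
    scores = np.tanh(features @ W_f + hidden @ W_h) @ w_a  # (num_regions,)
    alpha = softmax(scores)          # attention weights, sum to 1
    context = alpha @ features       # weighted context vector, (feat_dim,)
    return context, alpha

# toy dimensions for illustration only
rng = np.random.default_rng(0)
num_regions, feat_dim, hidden_dim, attn_dim = 4, 8, 6, 5
features = rng.normal(size=(num_regions, feat_dim))
hidden = rng.normal(size=hidden_dim)
W_f = rng.normal(size=(feat_dim, attn_dim))
W_h = rng.normal(size=(hidden_dim, attn_dim))
w_a = rng.normal(size=attn_dim)

context, alpha = attention_step(features, hidden, W_f, W_h, w_a)
```

In a full captioning loop, `context` would be concatenated with the previous word embedding and fed into the LSTM to predict the next word; the stimulus-driven vs. concept-driven distinction in the paper would govern how the scores are computed.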

