4.6 Article

Interactions Guided Generative Adversarial Network for unsupervised image captioning

Journal

NEUROCOMPUTING
Volume 417, Issue -, Pages 419-431

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2020.08.019

Keywords

Unsupervised image caption; Object-object interactions; Multi-scale feature

Funding

  1. National Natural Science Foundation of China [62072028, 61772067]
  2. National Key RAMP
  3. D Program of China [2019YFB2204200]

Ask authors/readers for more resources

Most of the current image captioning models that have achieved great successes heavily depend on manually labeled image-caption pairs. However, it is expensive and time-consuming to acquire large scale paired data. In this paper, we propose the Interactions Guided Generative Adversarial Network (IGGAN) for unsupervised image captioning, which joints multi-scale feature representation and object-object interactions. To get robust feature representation, the image is encoded by ResNet with a new Multi-scale module and adaptive Channel attention (RMCNet). Moreover, the information on object-object interactions is extracted by our Mutual Attention Network (MAN) and then adopted in the process of adversarial generation, which enhances the rationality of generated sentences. To encourage the sentence to be semantically consistent with the image, we utilize the image and generated sentence to reconstruct each other by cycle consistency in IGGAN. Our proposed model can generate sentences without any manually labeled image-caption pairs. Experimental results show that our proposed model achieves quite promising performance on the MSCOCO image captioning dataset. The ablation studies validate the effectiveness of our proposed modules. (C) 2020 Elsevier B.V. All rights reserved.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.6
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available