Article

Image Captioning using Reinforcement Learning with BLUDEr Optimization

Journal

PATTERN RECOGNITION AND IMAGE ANALYSIS
Volume 30, Issue 4, Pages 607-613

Publisher

SPRINGERNATURE
DOI: 10.1134/S1054661820040094

Keywords

image captioning; reinforcement learning; BLUDEr; self-critical sequence training; policy-gradient; deep learning; ResNet; long short-term memory; attention; BLEU; CIDEr

Funding

  1. 'Centre for Data Sciences and Applied Machine Learning' (CDSAML) of PES University

Image captioning is an active and growing field of research. It is a challenging task owing to the complexity of natural language generation and the difficulty of extracting features from a diverse collection of images. Many models have been proposed to tackle the problem, including state-of-the-art encoder-decoder (sequential CNN-RNN) systems that have proved capable of obtaining strong results. Recently, reinforcement learning has emerged as a new approach to the problem and has surpassed many of the state-of-the-art paradigms. We propose a new reward, the BLUDEr metric, which is a linear combination of the non-differentiable metrics BLEU and CIDEr, and we directly optimize our model for this metric on natural language generation tasks. In our experiments, we use the Flickr30k and Flickr8k datasets, two of the benchmark datasets for image captioning systems. We achieve state-of-the-art results on these two datasets when compared with other models.
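To make the reward construction concrete, the following is a minimal sketch of a reward formed as a linear combination of BLEU and CIDEr scores, plugged into a self-critical policy-gradient loss (self-critical sequence training appears in the keywords). The abstract does not state the combination weights, so `w_bleu` and `w_cider` are hypothetical placeholders, and the function names are illustrative rather than the authors' own. The BLEU and CIDEr values are assumed to be computed elsewhere and passed in as plain numbers.

```python
from typing import Sequence


def combined_reward(bleu: float, cider: float,
                    w_bleu: float = 0.5, w_cider: float = 0.5) -> float:
    """Linear combination of two caption metrics.

    The weights are assumptions for illustration; the abstract only says
    the reward is a linear combination of BLEU and CIDEr.
    """
    return w_bleu * bleu + w_cider * cider


def self_critical_loss(log_probs: Sequence[float],
                       sampled_reward: float,
                       baseline_reward: float) -> float:
    """Self-critical policy-gradient loss for one sampled caption.

    log_probs: log-probabilities of the tokens in the sampled caption.
    sampled_reward: reward of the sampled caption.
    baseline_reward: reward of the greedy-decoded caption, used as baseline.
    """
    # Samples that beat the greedy baseline get a positive advantage.
    advantage = sampled_reward - baseline_reward
    # Minimizing this loss raises the probability of above-baseline samples.
    return -advantage * sum(log_probs)


# Example with made-up metric values for a sampled and a greedy caption.
r_sample = combined_reward(bleu=0.4, cider=1.0)
r_greedy = combined_reward(bleu=0.3, cider=0.7)
loss = self_critical_loss([-1.0, -2.0], r_sample, r_greedy)
```

Because the metrics are non-differentiable, they enter only through the scalar advantage; the gradient flows through the token log-probabilities alone, which is what allows the metric to be optimized directly.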
