4.6 Article

An encoder-decoder based framework for Hindi image caption generation

Journal

MULTIMEDIA TOOLS AND APPLICATIONS
Volume 80, Issue 28-29, Pages 35721-35740

Publisher

SPRINGER
DOI: 10.1007/s11042-021-11106-5

Keywords

CNN; VGG16; VGG19; Stacked LSTM; Hindi image caption generation

Funding

  1. Scheme for Promotion of Academic and Research Collaboration (SPARC) - Ministry of Education (erstwhile MHRD), Govt of India [P995, SPARC/2018-2019/119/SL]


This study focuses on generating Hindi image captions using the Hindi Visual Genome dataset, with an encoder-decoder architecture that uses a CNN for image encoding and an sLSTM for caption generation. Experimental results indicate that the proposed model outperforms existing approaches both qualitatively and quantitatively.
In recent times, image caption generation has attracted considerable research interest. The present work attempts to address the problem of Hindi image caption generation using the Hindi Visual Genome dataset. Hindi is the official and most spoken language in India. In a linguistically diverse country like India, it is essential to provide a means of helping people understand visual entities in their native languages. In this paper, an encoder-decoder based architecture is proposed in which a Convolutional Neural Network (CNN) encodes the visual features of an image, and a stacked Long Short-Term Memory (sLSTM) network, combined with both uni-directional and bi-directional LSTMs, generates the captions in Hindi. A pre-trained VGG19 model is used to encode the visual feature representation of an image, and the sLSTM architecture is employed for caption generation on the decoder side. The model is tested on the Hindi Visual Genome dataset to validate the performance of the proposed approach, and cross-verification is carried out for English captions with the Flickr dataset. The experimental results show that the model is qualitatively and quantitatively better than state-of-the-art approaches for Hindi caption generation.
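To make the decoder side of the abstract more concrete, the following is a minimal pure-Python sketch of a stacked uni-directional LSTM forward pass. It is not the authors' implementation: the names `LSTMCell` and `run_stacked` are hypothetical, the weights are random, the bi-directional component is omitted, and seeding the first layer's hidden state with an image feature vector is an assumption about how CNN features might enter the decoder.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """One LSTM layer with randomly initialised weights (illustrative only)."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        n = input_size + hidden_size
        # One weight matrix and bias per gate:
        # i = input, f = forget, o = output, c = candidate cell state.
        self.W = {g: [[rng.uniform(-0.1, 0.1) for _ in range(n)]
                      for _ in range(hidden_size)] for g in "ifoc"}
        self.b = {g: [0.0] * hidden_size for g in "ifoc"}

    def step(self, x, h, c):
        z = x + h  # concatenate the input with the previous hidden state

        def gate(name, act):
            return [act(sum(w * v for w, v in zip(row, z)) + b)
                    for row, b in zip(self.W[name], self.b[name])]

        i = gate("i", sigmoid)
        f = gate("f", sigmoid)
        o = gate("o", sigmoid)
        g = gate("c", math.tanh)
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new


def run_stacked(cells, inputs, image_feature):
    """Run a token sequence through stacked LSTM layers.

    The hidden-state sequence of layer k is the input sequence of layer k+1.
    Assumption: the image feature (same length as the first layer's hidden
    size) initialises the first layer's hidden state.
    """
    seq = inputs
    for idx, cell in enumerate(cells):
        h = list(image_feature) if idx == 0 else [0.0] * cell.hidden_size
        c = [0.0] * cell.hidden_size
        outputs = []
        for x in seq:
            h, c = cell.step(x, h, c)
            outputs.append(h)
        seq = outputs
    return seq


# Usage: two stacked layers, toy 3-dimensional word embeddings,
# and a toy 4-dimensional stand-in for a projected CNN feature.
rng = random.Random(0)
cells = [LSTMCell(3, 4, rng), LSTMCell(4, 4, rng)]
tokens = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.5]]
states = run_stacked(cells, tokens, [0.5, -0.5, 0.2, 0.0])
```

In a full captioning model, each top-layer state would typically be projected onto the Hindi vocabulary to score the next word; that step, along with training, is left out of this sketch.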
