4.5 Article

Hierarchical Deep Neural Network for Image Captioning

期刊

NEURAL PROCESSING LETTERS
卷 52, 期 2, 页码 1057-1067

出版社

SPRINGER
DOI: 10.1007/s11063-019-09997-5

关键词

Regional semantic; Image captioning; Attention mechanism

向作者/读者索取更多资源

Automatically describing image content with natural language is a fundamental challenge for computer vision community. General methods used visual information to generate sentences directly. However, only depending on the visual information is not enough to generate the fine-grained descriptions for given images. In this paper, we exploit the fusion of visual information and high-level semantic information for image captioning. We propose a hierarchical deep neural network, which consists of the bottom layer and the top layer. The former extracts the visual and high-level semantic information from image and detected regions, respectively, while the latter integrates both of them with adaptive attention mechanism for the caption generation. The experimental results achieve the competing performances against the state-of-the-art methods on MSCOCO dataset.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.5
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据