Article

Cascade recurrent neural network for image caption generation

Journal

ELECTRONICS LETTERS
Volume 53, Issue 25, Pages 1642-1643

Publisher

INST ENGINEERING TECHNOLOGY-IET
DOI: 10.1049/el.2017.3159

Funding

  1. National Natural Science Foundation of China [61673402]
  2. Natural Science Foundation of Guangdong Province [2014A030313173, 2017A030311029]
  3. Science and Technology Program of Guangzhou, China [201704020180]
  4. Fundamental Research Funds for the Central Universities of China

A new cascade recurrent neural network (CRNN) for image caption generation is proposed. Unlike the classical multimodal recurrent neural network, which uses a single network to extract unidirectional syntactic features, CRNN adopts a cascade network that learns visual-language interactions in both the forward and backward directions, allowing it to exploit the deep semantic context contained in the image. The proposed framework constructs two embedding layers for dense word representation and introduces a new stacked Gated Recurrent Unit (GRU) for learning image-word mappings. The effectiveness of CRNN is verified on the commonly used MSCOCO dataset, where the results indicate that it achieves better performance than state-of-the-art image captioning methods such as Google NIC and the multimodal recurrent neural network.
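
The abstract gives no equations or layer sizes, so the following is only a rough illustration of two of the ideas it names: stacking GRU layers and reading a word sequence in both the forward and backward directions. It is not the authors' CRNN. The standard GRU update, the two-layer depth, the tiny dimensions, the random stand-in word vectors, and all names (`GRUCell`, `run_stacked`, `encode_both_directions`) are assumptions made for this sketch, and the image features that the real model conditions on are left out entirely.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Standard GRU cell, biases omitted for brevity (assumed, not from the paper)."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One input and one recurrent matrix per gate:
        # update (z), reset (r), candidate (n).
        self.W = {g: rand_matrix(hidden_size, input_size, rng) for g in "zrn"}
        self.U = {g: rand_matrix(hidden_size, hidden_size, rng) for g in "zrn"}

    def step(self, x, h):
        z = [sigmoid(a + b)
             for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b)
             for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b)
             for a, b in zip(matvec(self.W["n"], x), matvec(self.U["n"], rh))]
        # Interpolate between the previous state and the candidate state.
        return [(1 - zi) * hi + zi * ni for zi, hi, ni in zip(z, h, n)]


def run_stacked(cells, inputs):
    """Feed a sequence through a stack of GRU layers; return top-layer states."""
    seq = inputs
    for cell in cells:
        h = [0.0] * cell.hidden_size
        out = []
        for x in seq:
            h = cell.step(x, h)
            out.append(h)
        seq = out  # the next layer reads this layer's hidden states
    return seq


def encode_both_directions(fwd_cells, bwd_cells, inputs):
    """Concatenate forward and backward top-layer states at each position."""
    fwd = run_stacked(fwd_cells, inputs)
    bwd = run_stacked(bwd_cells, inputs[::-1])[::-1]
    return [f + b for f, b in zip(fwd, bwd)]


rng = random.Random(0)
embed_dim, hidden = 4, 3
# Stand-in for embedded caption words; a real model would learn these embeddings.
words = [[rng.uniform(-1, 1) for _ in range(embed_dim)] for _ in range(5)]
fwd_cells = [GRUCell(embed_dim, hidden, rng), GRUCell(hidden, hidden, rng)]
bwd_cells = [GRUCell(embed_dim, hidden, rng), GRUCell(hidden, hidden, rng)]
states = encode_both_directions(fwd_cells, bwd_cells, words)
print(len(states), len(states[0]))  # prints: 5 6
```

Each position ends up with a state that combines left-to-right and right-to-left context, which is the general benefit the abstract attributes to learning in both directions. How CRNN actually cascades its networks and injects the image is not stated in the abstract and is not modelled here.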
