☆ 4.7 Article

Deep Learning for Image-to-Text Generation: A Technical Overview

IEEE SIGNAL PROCESSING MAGAZINE (2017)

Journal

IEEE SIGNAL PROCESSING MAGAZINE

Volume 34, Issue 6, Pages 109-116

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/MSP.2017.2741510

Category

Engineering, Electrical & Electronic

Abstract

Generating a natural language description from an image is an emerging interdisciplinary problem at the intersection of computer vision, natural language processing, and artificial intelligence (AI). This task, often referred to as image or visual captioning, forms the technical foundation of many important applications, such as semantic visual search, visual intelligence in chatting robots, photo and video sharing on social media, and aids that help visually impaired people perceive surrounding visual content. Thanks to recent advances in deep learning, the AI research community has witnessed tremendous progress in visual captioning. In this article, we first summarize this emerging area of visual captioning. We then analyze the key developments and the major progress the community has made, their impact on both research and industry deployment, and what lies ahead in future breakthroughs.
