Article; Proceedings Paper

Image Understanding using vision and reasoning through Scene Description Graph

COMPUTER VISION AND IMAGE UNDERSTANDING (2018)

Journal

COMPUTER VISION AND IMAGE UNDERSTANDING

Volume 173, Pages 33-45

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE

DOI: 10.1016/j.cviu.2017.12.004

Keywords

Image Understanding; Commonsense Reasoning; Vision; Reasoning

Categories

Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic

Funding

National Science Foundation [SMA 1540917, CNS 1544797]

Abstract

Two of the fundamental tasks in image understanding using text are caption generation and visual question answering (Antol et al., 2015; Xiong et al., 2016). This work presents an intermediate knowledge structure that can be used for both tasks to obtain increased interpretability. We call this knowledge structure Scene Description Graph (SDG), as it is a directed labeled graph, representing objects, actions, regions, as well as their attributes, along with inferred concepts and semantic (from KM-Ontology (Clark et al., 2004)), ontological (i.e. superclass, hasProperty), and spatial relations. Thereby a general architecture is proposed in which a system can represent both the content and underlying concepts of an image using an SDG. The architecture is implemented using generic visual recognition techniques and commonsense reasoning to extract graphs from images. The utility of the generated SDGs is demonstrated in the applications of image captioning, image retrieval, and through examples in visual question answering. The experiments in this work show that the extracted graphs capture syntactic and semantic content of images with reasonable accuracy.
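To make the idea of a directed labeled graph concrete, the following is a minimal sketch of how an SDG-like structure could be held in memory. It is not the authors' implementation: the class and method names (`Node`, `SceneDescriptionGraph`, `add_node`, `add_edge`, `out_edges`), the node kinds, and the example scene are all hypothetical. Only the edge labels `superclass` and `hasProperty` come from the abstract; `agent`, `object`, and `on` are assumed stand-ins for semantic and spatial relations.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    name: str
    kind: str  # assumed kinds: "object", "action", "region", "attribute", "concept"


@dataclass
class SceneDescriptionGraph:
    """Hypothetical container for a directed graph with labeled edges."""
    nodes: dict = field(default_factory=dict)  # name -> Node
    edges: list = field(default_factory=list)  # (source, label, target) triples

    def add_node(self, name, kind):
        self.nodes[name] = Node(name, kind)

    def add_edge(self, source, label, target):
        # Edges are directed: source --label--> target.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append((source, label, target))

    def out_edges(self, source, label=None):
        # Return (label, target) pairs leaving `source`, optionally filtered by label.
        return [(l, t) for s, l, t in self.edges
                if s == source and (label is None or l == label)]


def build_example():
    """Build a toy graph for an imagined image of a person riding a brown horse."""
    g = SceneDescriptionGraph()
    g.add_node("person", "object")
    g.add_node("horse", "object")
    g.add_node("ride", "action")
    g.add_node("animal", "concept")
    g.add_node("brown", "attribute")
    g.add_edge("ride", "agent", "person")      # assumed semantic relation label
    g.add_edge("ride", "object", "horse")      # assumed semantic relation label
    g.add_edge("horse", "superclass", "animal")  # ontological relation named in the abstract
    g.add_edge("horse", "hasProperty", "brown")  # ontological relation named in the abstract
    g.add_edge("person", "on", "horse")        # assumed spatial relation label
    return g
```

Under these assumptions, a question such as "what is riding?" reduces to a lookup like `build_example().out_edges("ride", "agent")`, which hints at why a single explicit graph could serve captioning, retrieval, and question answering alike.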
