Article

Visual question answering by pattern matching and reasoning

Journal

NEUROCOMPUTING
Volume 467, Pages 323-336

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2021.10.016

Keywords

Visual question answering; Reinforcement learning; Inference; Pattern matching

Funding

  1. National Key Research and Development Program of China [2017YFA0700800]


The proposed method combines entity-attribute graphs, query graphs, a reinforcement-learning model, and an inference scheme to process visual tasks efficiently and answer questions accurately.
Traditional techniques for visual question answering (VQA) are mostly end-to-end neural networks, which often perform poorly (e.g., low accuracy) due to a lack of understanding and reasoning. To overcome these weaknesses, we propose a comprehensive approach with the following key features. (1) It represents the inputs, i.e., an image f and a natural-language question Q_nl, as an entity-attribute graph and a query graph, respectively, and employs pattern matching to find answers; (2) it leverages a reinforcement-learning-based model to identify a set of policies that guide visual tasks and construct the entity-attribute graph, based on Q_nl; (3) it employs a novel method to parse a question Q_nl and generate the corresponding query graph Q_u^o for pattern matching; and (4) it integrates an inference scheme to further improve result accuracy; in particular, it learns a graph-structured classifier for missing-value inference and a co-occurrence matrix for candidate selection. With these features, our approach can not only process visual tasks efficiently but also answer questions with high accuracy. To evaluate the performance of our approach, we conduct empirical studies on Soccer, Visual Genome, and GQA, and show that our approach outperforms state-of-the-art methods in result accuracy and system efficiency. (C) 2021 Elsevier B.V. All rights reserved.
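The pattern-matching step described in the abstract can be illustrated with a minimal sketch: a question parsed into a small query graph is matched against an entity-attribute graph built from the image, and the answer is read off the matched node. The graph schema, node labels, relation names, and toy data below are hypothetical examples chosen for illustration, not the paper's actual data structures or matching algorithm.

```python
# Minimal sketch of query-graph-to-entity-attribute-graph matching.
# All identifiers (entities, edges, query_nodes, etc.) are hypothetical.
from itertools import permutations

# Entity-attribute graph for a toy soccer image: nodes carry attribute
# dicts, edges are (subject, relation, object) triples.
entities = {
    "p1": {"type": "player", "jersey_color": "red", "number": "10"},
    "p2": {"type": "player", "jersey_color": "blue", "number": "7"},
    "b1": {"type": "ball"},
}
edges = {("p1", "kicks", "b1"), ("p2", "guards", "p1")}

# Query graph for "What is the number of the player kicking the ball?":
# variable nodes with required attributes, one edge constraint, and the
# attribute whose value answers the question.
query_nodes = {"?x": {"type": "player"}, "?y": {"type": "ball"}}
query_edges = {("?x", "kicks", "?y")}
answer_of = ("?x", "number")

def node_matches(attrs, required):
    """A data node matches a query node if all required attributes agree."""
    return all(attrs.get(k) == v for k, v in required.items())

def match(query_nodes, query_edges, entities, edges):
    """Naive subgraph matching: try every injective assignment of query
    variables to entities and keep those satisfying all edge constraints."""
    variables = list(query_nodes)
    for combo in permutations(entities, len(variables)):
        binding = dict(zip(variables, combo))
        if not all(node_matches(entities[binding[v]], query_nodes[v])
                   for v in variables):
            continue
        if all((binding[s], r, binding[o]) in edges
               for s, r, o in query_edges):
            yield binding

for binding in match(query_nodes, query_edges, entities, edges):
    var, attr = answer_of
    print("answer:", entities[binding[var]].get(attr))  # -> answer: 10
```

In the paper's setting, when the matched node lacks the queried attribute, the abstract's graph-structured classifier would infer the missing value and the learned co-occurrence matrix would rank candidate answers; the brute-force enumeration above would likewise be replaced by a proper subgraph-matching procedure on larger graphs.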

