Article

Visual question answering by pattern matching and reasoning

Journal

NEUROCOMPUTING
Volume 467, Pages 323-336

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2021.10.016

Keywords

Visual question answering; Reinforcement learning; Inference; Pattern matching

Funding

  1. National Key Research and Development Program of China [2017YFA0700800]


The proposed method combines entity-attribute graphs, query graphs, a reinforcement-learning model, and an inference scheme to process visual tasks efficiently and answer questions accurately.
Traditional techniques for visual question answering (VQA) are mostly end-to-end neural networks, which often perform poorly (e.g., low accuracy) due to a lack of understanding and reasoning. To overcome these weaknesses, we propose a comprehensive approach with the following key features. (1) It represents the inputs, i.e., an image f and a natural-language question Q_nl, as an entity-attribute graph and a query graph, respectively, and employs pattern matching to find answers; (2) it leverages a reinforcement-learning-based model to identify a set of policies that guide visual tasks and construct the entity-attribute graph, based on Q_nl; (3) it employs a novel method to parse a question Q_nl and generate the corresponding query graph Q_u^o for pattern matching; and (4) it integrates an inference scheme to further improve result accuracy; in particular, it learns a graph-structured classifier for missing-value inference and a co-occurrence matrix for candidate selection. With these features, our approach can not only process visual tasks efficiently but also answer questions with high accuracy. To evaluate the performance of our approach, we conduct empirical studies on Soccer, Visual Genome, and GQA, and show that our approach outperforms state-of-the-art methods in result accuracy and system efficiency. (C) 2021 Elsevier B.V. All rights reserved.
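The pattern-matching step described in the abstract can be illustrated with a minimal sketch: a question parsed into a small query graph is matched against an entity-attribute graph built from the image, and the answer is read off the matched node. The graph schema, node labels, relation names, and toy data below are hypothetical examples chosen for illustration, not the paper's actual data structures or matching algorithm.

```python
# Minimal sketch of query-graph-to-entity-attribute-graph matching.
# All identifiers (entities, edges, query_nodes, etc.) are hypothetical.
from itertools import permutations

# Entity-attribute graph for a toy soccer image: nodes carry attribute
# dicts, edges are (subject, relation, object) triples.
entities = {
    "p1": {"type": "player", "jersey_color": "red", "number": "10"},
    "p2": {"type": "player", "jersey_color": "blue", "number": "7"},
    "b1": {"type": "ball"},
}
edges = {("p1", "kicks", "b1"), ("p2", "guards", "p1")}

# Query graph for "What is the number of the player kicking the ball?":
# variable nodes with required attributes, one edge constraint, and the
# attribute whose value answers the question.
query_nodes = {"?x": {"type": "player"}, "?y": {"type": "ball"}}
query_edges = {("?x", "kicks", "?y")}
answer_of = ("?x", "number")

def node_matches(attrs, required):
    """A data node matches a query node if all required attributes agree."""
    return all(attrs.get(k) == v for k, v in required.items())

def match(query_nodes, query_edges, entities, edges):
    """Naive subgraph matching: try every injective assignment of query
    variables to entities and keep those satisfying all edge constraints."""
    variables = list(query_nodes)
    for combo in permutations(entities, len(variables)):
        binding = dict(zip(variables, combo))
        if not all(node_matches(entities[binding[v]], query_nodes[v])
                   for v in variables):
            continue
        if all((binding[s], r, binding[o]) in edges
               for s, r, o in query_edges):
            yield binding

for binding in match(query_nodes, query_edges, entities, edges):
    var, attr = answer_of
    print("answer:", entities[binding[var]].get(attr))  # -> answer: 10
```

In the paper's setting, when the matched node lacks the queried attribute, the abstract's graph-structured classifier would infer the missing value and the learned co-occurrence matrix would rank candidate answers; the brute-force enumeration above would likewise be replaced by a proper subgraph-matching procedure on larger graphs.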

