4.6 Article

Confidence-based interactable neural-symbolic visual question answering

Journal

NEUROCOMPUTING
Volume 564, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2023.126991

Keywords

Confidence-based neural-symbolic methods; Interactable neural-symbolic methods; Visual question answering

Visual question answering requires processing multi-modal information and reasoning effectively over it. Neural-symbolic learning is a promising method, but current approaches do not handle uncertainty and can only provide a single answer. To address this, we propose a confidence-based neural-symbolic approach that evaluates the confidence of neural network inferences and reasons based on that confidence.
The visual question answering (VQA) task demands proficiency in processing multi-modal information and the ability to reason effectively using that information. One promising method for this task is neural-symbolic (NS) learning, which leverages the strengths of both neural network (NN) learning and symbolic reasoning to achieve efficient VQA. However, current NS approaches do not account for the uncertain nature of NN learning and can only provide a single answer to a question without any indication of its confidence, which limits their ability to handle incorrect reasoning. To address this limitation, we propose a confidence-based neural-symbolic (CBNS) approach, which evaluates the confidence of the NN inferences based on uncertainty quantification and performs confidence-based reasoning. The proposed approach comprises three main components: (1) a probabilistic question parser that generates multiple program candidates, each with a corresponding confidence evaluation; (2) a probabilistic scene perception module that provides an object-based scene representation and confidence evaluations for each attribute of the objects in an image; and (3) a confidence-based program executor that provides answers with confidence evaluations throughout the inference process by leveraging the confidence evaluations of the scene representation and the programs. Additionally, we present a data augmentation method to improve the training efficiency of NS learning. The proposed approach allows user interaction and feedback on the weak links identified by the confidence evaluations. Experiments on the CLEVR and GQA datasets demonstrate that the proposed approach is effective in identifying the correctness of predictions and leads to a promising performance improvement with a significantly reduced computation cost.
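
To make the three-component pipeline more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that confidences are plain probabilities combined by multiplication and that answer scores are summed over program candidates; the abstract does not specify how the CBNS approach actually computes or combines confidence. The names `scene`, `programs`, `execute`, and `rank_answers`, and the tiny `filter`/`query` program vocabulary, are all hypothetical.

```python
from collections import defaultdict

# Hypothetical output of a probabilistic scene perception module:
# one dict per object, mapping attribute -> {value: probability}.
scene = [
    {"color": {"red": 0.9, "blue": 0.1}, "shape": {"cube": 0.8, "sphere": 0.2}},
    {"color": {"red": 0.3, "blue": 0.7}, "shape": {"cube": 0.4, "sphere": 0.6}},
]

# Hypothetical output of a probabilistic question parser:
# several program candidates, each with a confidence.
programs = [
    (0.7, [("filter", "color", "red"), ("query", "shape")]),
    (0.3, [("filter", "color", "blue"), ("query", "shape")]),
]


def execute(steps, scene):
    """Run one program and return (answer, confidence).

    Assumption: confidence is the product of the probabilities of every
    attribute decision made along the way.
    """
    candidates = [(obj, 1.0) for obj in scene]
    for step in steps:
        if step[0] == "filter":
            _, attr, value = step
            # Weight each object by how likely it is to have the value.
            candidates = [
                (obj, conf * obj[attr].get(value, 0.0)) for obj, conf in candidates
            ]
            candidates = [(obj, conf) for obj, conf in candidates if conf > 0.0]
        elif step[0] == "query":
            _, attr = step
            if not candidates:
                return None, 0.0
            # Pick the most confident remaining object and its most likely value.
            obj, conf = max(candidates, key=lambda oc: oc[1])
            value, prob = max(obj[attr].items(), key=lambda kv: kv[1])
            return value, conf * prob
    raise ValueError("program has no query step")


def rank_answers(programs, scene):
    """Combine parser and executor confidences into a ranked answer list."""
    scores = defaultdict(float)
    for program_conf, steps in programs:
        answer, exec_conf = execute(steps, scene)
        if answer is not None:
            scores[answer] += program_conf * exec_conf
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


ranked = rank_answers(programs, scene)
print(ranked)
```

Under these assumptions the executor returns a ranked list of answers rather than a single one, so a low top score or a small gap between the first two entries could be used to flag a question for user feedback, in the spirit of the interaction described above.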
