Article

Confidence-based interactable neural-symbolic visual question answering

NEUROCOMPUTING (2024)

Journal

NEUROCOMPUTING

Volume 564, Issue -, Pages -

Publisher

ELSEVIER

DOI: 10.1016/j.neucom.2023.126991

Keywords

Confidence-based neural-symbolic methods; Interactable neural-symbolic methods; Visual question answering

Category

Computer Science, Artificial Intelligence

Abstract

Visual question answering requires processing multi-modal information and reasoning effectively over it. Neural-symbolic learning is a promising method, but current approaches do not handle uncertainty and can only provide a single answer. To address this, we propose a confidence-based neural-symbolic approach that evaluates the confidence of NN inferences and conducts reasoning based on that confidence.

The visual question answering (VQA) task demands proficiency in processing multi-modal information and the ability to reason effectively with that information. One promising method for this task is neural-symbolic (NS) learning, which leverages the strengths of both neural network (NN) learning and symbolic reasoning to achieve efficient VQA. However, current NS approaches do not account for the uncertain nature of NN learning and can only provide a single answer to a question without any indication of its confidence, which limits their ability to handle incorrect reasoning. To address this limitation, we propose a confidence-based neural-symbolic (CBNS) approach, which evaluates the confidence of the NN inferences based on uncertainty quantification and performs confidence-based reasoning. The proposed approach comprises three main components: (1) a probabilistic question parser that generates multiple program candidates, each with a corresponding confidence evaluation; (2) a probabilistic scene perception module that provides an object-based scene representation and confidence evaluations for each attribute of the objects in an image; and (3) a confidence-based program executor that provides answers with confidence evaluations throughout the inference process by leveraging the confidence evaluations of the scene representation and the programs. Additionally, we present a data augmentation method to improve the training efficiency of NS learning. The proposed approach allows user interaction and feedback on the weak links identified by the confidence evaluations. Experiments on the CLEVR and GQA datasets demonstrate that the proposed approach is effective in identifying the correctness of predictions and leads to a promising performance improvement with a significantly reduced computation cost.
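
To make the interplay of the three components more concrete, the following is a minimal Python sketch, not the authors' implementation. Everything in it is an assumption made for illustration: the scene encoding, the two-operation program vocabulary (`filter_<attribute>` and `count`), the names `SCENE`, `CANDIDATES`, `execute`, and `answer`, and in particular the rule of multiplying confidences along a program and summing them over program candidates. The abstract does not specify how CBNS combines confidences.

```python
from collections import defaultdict

# Hypothetical scene perception output: one dict per object,
# mapping attribute -> {value: confidence}.
SCENE = [
    {"color": {"red": 0.9, "brown": 0.1}},
    {"color": {"blue": 0.7, "red": 0.3}},
    {"color": {"blue": 0.6, "green": 0.4}},
]

# Hypothetical question parser output: (program, confidence) candidates.
# A program is a list of (operation, argument) steps.
CANDIDATES = [
    ([("filter_color", "red"), ("count", None)], 0.75),
    ([("filter_color", "blue"), ("count", None)], 0.25),
]


def execute(program, scene, threshold=0.5):
    """Run one program and return (answer, confidence, weakest_link).

    Assumption: the execution confidence is the product of the
    confidences of every keep/drop decision made while filtering.
    """
    selected = list(enumerate(scene))  # keep original object indices
    confidence = 1.0
    weakest = {"step": None, "object": None, "confidence": 1.0}
    for op, arg in program:
        if op.startswith("filter_"):
            attr = op[len("filter_"):]
            kept = []
            for idx, obj in selected:
                p = obj[attr].get(arg, 0.0)
                keep = p >= threshold
                # Confidence that this keep/drop decision is right.
                decision_conf = p if keep else 1.0 - p
                confidence *= decision_conf
                if decision_conf < weakest["confidence"]:
                    weakest = {"step": (op, arg), "object": idx,
                               "confidence": decision_conf}
                if keep:
                    kept.append((idx, obj))
            selected = kept
        elif op == "count":
            return len(selected), confidence, weakest
        else:
            raise ValueError(f"unknown operation: {op}")
    raise ValueError("program ended without producing an answer")


def answer(candidates, scene):
    """Score each distinct answer over all program candidates."""
    scores = defaultdict(float)
    for program, program_conf in candidates:
        result, exec_conf, _ = execute(program, scene)
        scores[result] += program_conf * exec_conf
    best = max(scores, key=scores.get)
    return best, scores[best], dict(scores)


if __name__ == "__main__":
    best, conf, scores = answer(CANDIDATES, SCENE)
    print("answer:", best, "confidence:", round(conf, 4))
    print("all answers:", scores)
    print("weakest link:", execute(CANDIDATES[0][0], SCENE)[2])
```

In this toy setup the `weakest_link` record points at the least certain filtering decision, which is the kind of information a user could inspect and correct in an interactive loop; how CBNS actually selects weak links and incorporates feedback is described in the paper itself, not here.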
