Article

Medical visual question answering based on question-type reasoning and semantic space constraint

ARTIFICIAL INTELLIGENCE IN MEDICINE (2022)

Journal

ARTIFICIAL INTELLIGENCE IN MEDICINE

Volume 131, Issue -, Pages -

Publisher

ELSEVIER

DOI: 10.1016/j.artmed.2022.102346

Keywords

Medical visual question answering; Question-type reasoning; Semantic space constraint; Attention mechanism

Categories

Computer Science, Artificial Intelligence; Engineering, Biomedical; Medical Informatics

Funding

National Natural Science Foundation of China [61871278]
Chengdu Major Technology Application Demonstration Project [2019-YF09-00120-SN]
Fundamental Research Funds for the Central Universities [2021SCU12061]

Smart summary

Medical visual question answering (Med-VQA) is a task to accurately answer clinical questions about medical images. This paper proposes a novel Med-VQA framework that addresses the challenges of diverse clinical questions and the relationship between candidate responses. The framework includes a question-type reasoning module, attention mechanism, and semantic constraint space to extract valuable question features and consider the correlation between answers. Experimental results demonstrate improved performance compared to state-of-the-art methods.

Abstract

Medical visual question answering (Med-VQA) aims to accurately answer clinical questions about medical images. Despite its enormous potential for application in the medical domain, the technology is still in its infancy. Compared with the general visual question answering task, the Med-VQA task poses more demanding challenges. First, clinical questions about medical images are usually diverse because they come from different clinicians and reflect the complexity of diseases. Consequently, noise is inevitably introduced when extracting question features. Second, the Med-VQA task has always been regarded as a classification problem over predefined answers, ignoring the relationships between candidate responses. Thus, the Med-VQA model pays equal attention to all candidate answers when predicting answers. In this paper, a novel Med-VQA framework is proposed to alleviate these problems. Specifically, we applied a question-type reasoning module separately to closed-ended and open-ended questions, extracting the important information contained in the questions through an attention mechanism and filtering out noise to obtain more valuable question features. To take advantage of the relational information between answers, we designed a semantic constraint space to calculate the similarity between answers and assign higher attention to answers with high correlation. To evaluate the effectiveness of the proposed method, extensive experiments were conducted on a public dataset, VQA-RAD. Experimental results showed that the proposed method outperformed other state-of-the-art methods. The overall accuracy, closed-ended accuracy, and open-ended accuracy reached 74.1%, 82.7%, and 60.9%, respectively. Notably, the absolute accuracy of the proposed method improved by 5.5% for closed-ended questions.
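
The abstract does not say how the semantic constraint space is built, so the following Python sketch is only one possible reading of "calculate the similarity between the answers and assign higher attention to answers with high correlation". It is not the authors' implementation. The function names, the toy answer vectors, and the `temperature` parameter are all assumptions made for illustration. The idea shown is that, instead of a one-hot target that treats every wrong candidate equally, a target distribution can be derived from the cosine similarity between each candidate answer and the correct one.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def similarity_soft_targets(answer_vectors, true_index, temperature=0.1):
    """Build a target distribution over candidate answers.

    The correct answer gets the most mass, and candidates whose
    (hypothetical) embeddings are close to it get more mass than
    unrelated candidates. `temperature` is an illustrative knob.
    """
    true_vec = answer_vectors[true_index]
    sims = [cosine_similarity(true_vec, vec) for vec in answer_vectors]
    return softmax([s / temperature for s in sims])


# Toy, hand-picked embeddings (assumption): two related answers and one unrelated.
ANSWERS = ["left lung", "right lung", "yes"]
VECTORS = [
    [1.0, 0.2, 0.0],
    [0.9, 0.4, 0.0],
    [0.0, 0.0, 1.0],
]

targets = similarity_soft_targets(VECTORS, true_index=0)
for answer, weight in zip(ANSWERS, targets):
    print(f"{answer}: {weight:.3f}")
```

With these toy vectors, the correct answer receives the largest weight, the semantically close candidate receives the next largest, and the unrelated candidate receives almost none. A classifier trained against such targets would be penalized less for confusing related answers than for choosing an unrelated one.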
