Journal
INFORMATION FUSION
Volume 97, Issue -, Pages -
Publisher
ELSEVIER
DOI: 10.1016/j.inffus.2023.101817
Keywords
Medical report generation; Knowledge graph; Multi-modal fusion; Graph neural network
In this paper, a vision-knowledge fusion model based on medical images and knowledge graphs is proposed to fully utilize high-quality data from different diseases and languages. The model automatically constructs domain-specific knowledge graphs based on medical standards, fuses image and knowledge features using a knowledge-based attention mechanism, and recovers fine-grained knowledge through a triples restoration module. Experimental results show that the model outperforms previous benchmark methods and achieves excellent evaluation scores on datasets for two different diseases. The interpretability and clinical usefulness of the model are validated, and it generalizes to multiple domains and diseases.
Medical report generation with knowledge graphs is an essential task in the medical field. Although existing knowledge graphs contain many entities, their semantics are insufficient because of the challenge of uniformly extracting and fusing expert knowledge across different diseases. It is therefore necessary to construct disease-specific knowledge graphs automatically. In this paper, we propose a vision-knowledge fusion model based on medical images and knowledge graphs to fully utilize high-quality data from different diseases and languages. First, we present a general method for automatically constructing a knowledge graph for any domain based on medical standards. Second, we design a knowledge-based attention mechanism to effectively fuse image and knowledge. We then build a triples restoration module to obtain fine-grained knowledge, and we are the first to propose knowledge-based evaluation metrics, which are more reasonable and measurable along different dimensions. Finally, we conduct experiments to verify the effectiveness of our model on datasets for two different diseases: the public IU-Xray chest radiograph dataset and the NCRC-DS dataset of Chinese dermoscopy reports that we compiled. Our model outperforms previous benchmark methods and achieves excellent evaluation scores on both datasets. Additionally, the interpretability and clinical usefulness of the model are validated, and our method generalizes to multiple domains and diseases.
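The abstract describes a knowledge-based attention mechanism for fusing image features with knowledge-graph entities, but the page does not give the architecture. As a hypothetical illustration only (the function name, fusion-by-concatenation choice, and use of scaled dot-product attention are all assumptions, not the paper's published design), such a fusion step could look like:

```python
import numpy as np

def knowledge_attention_fusion(image_feat, entity_embs):
    """Hypothetical sketch: attend over knowledge-graph entity embeddings
    with the image feature as the query, then fuse by concatenation.

    image_feat:  (d,) global image feature vector
    entity_embs: (num_entities, d) embeddings of graph entities
    """
    d = image_feat.shape[-1]
    # Scaled dot-product scores between the image query and each entity
    scores = entity_embs @ image_feat / np.sqrt(d)
    # Softmax over entities gives the attention weights
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Knowledge context = attention-weighted sum of entity embeddings
    knowledge_context = weights @ entity_embs
    # Fuse image and knowledge representations (simple concatenation here)
    fused = np.concatenate([image_feat, knowledge_context])
    return fused, weights

# Toy usage: 3 entities, feature dimension 4
image_feat = np.array([1.0, 0.0, 0.0, 0.0])
entity_embs = np.eye(3, 4)
fused, weights = knowledge_attention_fusion(image_feat, entity_embs)
```

In this toy example the entity most aligned with the image feature receives the largest attention weight; a report decoder would then condition on the fused vector.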