Article

MATNet: Exploiting Multi-Modal Features for Radiology Report Generation

Journal

IEEE SIGNAL PROCESSING LETTERS
Volume 29, Pages 2692-2696

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/LSP.2022.3229844

Keywords

Diseases; Transformers; Decoding; Visualization; Feature extraction; Radiology; MIMICs; Radiology report generation; multi-modal learning; medical image processing

Funding

  1. National Natural Science Foundation of China [62003065]
  2. Natural Science Foundation Project of Chongqing Science and Technology Bureau [CSTB2022NSCQ-MSX1206, CSTB2022TFII-OFX0042, cstc2019jscx-mbdxX0061]
  3. Key Science and Technology Research Program of Chongqing Municipal Education Commission [KJZD-K202200510]

Abstract

Medical imaging is widely used in hospital clinical workflows. Assisting physicians in diagnosis by automatically generating reports from radiological images is an unmet clinical demand that requires urgent attention. However, this task suffers from two significant problems: 1) visual and textual data biases, and 2) the Transformer decoder makes no distinction between visual and non-visual words. We propose a novel multi-task approach combining natural language processing with machine learning techniques to meet this clinical need, i.e., to create fluent and accurate radiology reports. We name our system the Multi-modal Adaptive Transformer (MATNet), which consists of three key modules. First, the Multi-Modal Encoder (MME) explores the relationship between radiology images and clinical notes. Second, the Disease Classifier (DC) classifies the state of each disease topic and provides state-aware disease embeddings to alleviate visual data bias. Last, the Adaptive Decoder (AD) dynamically measures the contributions of the source and target signals when generating the next word. In evaluations on the benchmark IU-XRay and MIMIC-CXR datasets, the proposed MATNet outperformed previous state-of-the-art models on language fluency and clinical accuracy metrics such as BLEU scores.
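
The abstract describes the three modules only at a high level. Below is a minimal PyTorch sketch of one way such a pipeline could be wired together; the class names mirror the paper's module names (MME, DC, AD), but every layer size, the mean pooling, the 14-topic/3-state disease space, and the sigmoid gating are illustrative assumptions, not the authors' published implementation.

```python
# Hypothetical sketch of the three MATNet modules named in the abstract.
# All architectural details below are assumptions for illustration only.
import torch
import torch.nn as nn


class MultiModalEncoder(nn.Module):
    """MME: jointly encodes image region features and clinical-note embeddings."""

    def __init__(self, d_model=512, nhead=8, num_layers=3):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)

    def forward(self, image_feats, note_embeds):
        # Concatenate the two modalities along the sequence axis and let
        # self-attention model cross-modal interactions.
        fused = torch.cat([image_feats, note_embeds], dim=1)
        return self.encoder(fused)


class DiseaseClassifier(nn.Module):
    """DC: predicts a state per disease topic and returns state-aware embeddings."""

    def __init__(self, d_model=512, num_topics=14, num_states=3):
        super().__init__()
        self.cls = nn.Linear(d_model, num_topics * num_states)
        self.state_embed = nn.Embedding(num_topics * num_states, d_model)
        self.num_topics, self.num_states = num_topics, num_states

    def forward(self, memory):
        pooled = memory.mean(dim=1)                       # (B, d_model)
        logits = self.cls(pooled).view(-1, self.num_topics, self.num_states)
        states = logits.argmax(dim=-1)                    # (B, num_topics)
        idx = states + torch.arange(self.num_topics, device=states.device) * self.num_states
        return logits, self.state_embed(idx)              # embeddings: (B, num_topics, d_model)


class AdaptiveDecoder(nn.Module):
    """AD: gates between source (visual/disease) context and target (language) context."""

    def __init__(self, vocab_size, d_model=512, nhead=8, num_layers=3):
        super().__init__()
        self.tok_embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        self.gate = nn.Linear(2 * d_model, 1)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, tokens, memory):
        tgt = self.tok_embed(tokens)
        seq_len = tokens.size(1)
        # Standard causal mask so each position only attends to earlier words.
        mask = torch.triu(
            torch.full((seq_len, seq_len), float("-inf"), device=tokens.device), diagonal=1
        )
        hidden = self.decoder(tgt, memory, tgt_mask=mask)   # source-conditioned states
        src_ctx = memory.mean(dim=1, keepdim=True).expand_as(hidden)
        g = torch.sigmoid(self.gate(torch.cat([hidden, src_ctx], dim=-1)))
        mixed = g * src_ctx + (1.0 - g) * hidden            # per-word adaptive mixing
        return self.out(mixed)


# Toy forward pass (shapes only): 49 image regions, 32 note tokens, 20 report tokens.
mme, dc = MultiModalEncoder(), DiseaseClassifier()
ad = AdaptiveDecoder(vocab_size=1000)
img, note = torch.randn(2, 49, 512), torch.randn(2, 32, 512)
memory = mme(img, note)
_, disease_embeds = dc(memory)
report_logits = ad(torch.randint(0, 1000, (2, 20)), torch.cat([memory, disease_embeds], dim=1))
```

In this sketch, the sigmoid gate re-weights the source-side context against the decoder's own language state for every generated word, which loosely mirrors the visual versus non-visual word distinction the abstract motivates; the real mechanism in the paper may differ.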
