Proceedings Paper

Cross-Modal Prototype Driven Network for Radiology Report Generation

Proceedings

COMPUTER VISION - ECCV 2022, PT XXXV
Volume 13695, Pages 563-579

Publisher

SPRINGER INTERNATIONAL PUBLISHING AG
DOI: 10.1007/978-3-031-19833-5_33

Keywords

Radiology report generation; Cross-modal pattern learning; Prototype learning; Transformers


This paper proposes XPRONET, a network that improves radiology report generation through cross-modal prototype learning and an improved multi-label contrastive loss. Experiments show that XPRONET achieves substantial improvements on two benchmarks.
Radiology report generation (RRG) aims to automatically describe a radiology image in human-like language and could support the work of radiologists, reducing the burden of manual reporting. Previous approaches often adopt an encoder-decoder architecture and focus on single-modal feature learning, while few studies explore cross-modal feature interaction. Here we propose a Cross-modal PROtotype driven NETwork (XPRONET) to promote cross-modal pattern learning and exploit it to improve radiology report generation. This is achieved by three well-designed, fully differentiable and complementary modules: a shared cross-modal prototype matrix to record the cross-modal prototypes; a cross-modal prototype network to learn the cross-modal prototypes and embed the cross-modal information into the visual and textual features; and an improved multi-label contrastive loss to enable and enhance multi-label prototype learning. XPRONET obtains substantial improvements on the IU-Xray and MIMIC-CXR benchmarks, exceeding recent state-of-the-art approaches by a large margin on IU-Xray while achieving comparable performance on MIMIC-CXR. (The code is publicly available at https://github.com/Markin-Wang/XProNet.)
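The abstract names three concrete modules. Below is a minimal, illustrative PyTorch sketch of two of the ideas: attending over a shared cross-modal prototype matrix and fusing the retrieved prototype information back into the features, and a multi-label contrastive loss over feature-prototype similarities. All names, dimensions, and hyper-parameters here are assumptions made for illustration, not the authors' implementation; the real code is at https://github.com/Markin-Wang/XProNet.

```python
# Hedged sketch of cross-modal prototype querying and a multi-label
# contrastive loss. Module names, dimensions, and tau are illustrative
# assumptions, not taken from the paper's released code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossModalPrototypeQuery(nn.Module):
    """Attend over a shared prototype matrix and fuse the retrieved
    cross-modal information back into the input features."""
    def __init__(self, num_prototypes=128, dim=512):
        super().__init__()
        # Shared cross-modal prototype matrix, learned end-to-end.
        self.prototypes = nn.Parameter(torch.randn(num_prototypes, dim))

    def forward(self, feats):  # feats: (B, N, dim), visual or textual
        # Similarity of each feature to every prototype.
        attn = torch.softmax(feats @ self.prototypes.t(), dim=-1)  # (B, N, P)
        retrieved = attn @ self.prototypes                         # (B, N, dim)
        # Residual fusion: embed cross-modal information into the features.
        return feats + retrieved

def multilabel_contrastive_loss(sim, labels, tau=0.07):
    """Stand-in for a multi-label contrastive loss: pull each sample
    toward the prototypes of all its positive labels and away from the
    rest. sim: (B, P) feature-prototype similarities; labels: (B, P)
    multi-hot assignment matrix."""
    log_prob = F.log_softmax(sim / tau, dim=-1)
    # Average log-probability over each sample's positive prototypes.
    pos = (labels * log_prob).sum(-1) / labels.sum(-1).clamp(min=1)
    return -pos.mean()

# Illustrative usage with random tensors.
query = CrossModalPrototypeQuery()
visual = torch.randn(2, 49, 512)          # e.g. 7x7 patch features
fused = query(visual)                     # features enriched by prototypes
sim = visual.mean(1) @ query.prototypes.t()
labels = torch.randint(0, 2, (2, 128)).float()
loss = multilabel_contrastive_loss(sim, labels)
```

In this sketch the same prototype matrix would be queried by both visual and textual features, which is what makes the prototypes cross-modal: both modalities read from, and backpropagate into, one shared memory.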

