Article

Automated Radiographic Report Generation Purely on Transformer: A Multicriteria Supervised Approach

Journal

IEEE TRANSACTIONS ON MEDICAL IMAGING
Volume 41, Issue 10, Pages 2803-2813

Publisher

IEEE (Institute of Electrical and Electronics Engineers)
DOI: 10.1109/TMI.2022.3171661

Keywords

Transformers; Visualization; Medical diagnostic imaging; Feature extraction; Task analysis; Decoding; Training; Medical report generation; image caption; transformer; image-text matching

Funding

  1. National Key Research and Development Program of China [2020AAA0108303]
  2. National Natural Science Foundation of China (NSFC) [41876098]

Abstract

This paper proposes a pure transformer-based framework for automated radiographic report generation in the medical field. It addresses the challenges of visual similarity among medical images and the importance of disease-related words. By improving visual-textual alignment, multi-label classification, and word importance weighting, the framework achieves promising performance in generating accurate reports.
Automated radiographic report generation is challenging in at least two respects. First, medical images are very similar to one another, and the visual differences of clinical importance are often fine-grained. Second, disease-related words may be submerged by the many similar sentences describing the common content of the images, causing abnormal findings to be misinterpreted as normal in the worst case.

To tackle these challenges, this paper proposes a pure transformer-based framework that jointly enforces better visual-textual alignment, multi-label diagnostic classification, and word importance weighting to facilitate report generation. To the best of our knowledge, this is the first pure transformer-based framework for medical report generation, and it enjoys the capacity of transformers to learn long-range dependencies for both image regions and sentence words.

Specifically, for the first challenge, we design a novel mechanism that embeds an auxiliary image-text matching objective into the transformer's encoder-decoder structure, so that better-correlated image and text features can be learned to help a report discriminate between similar images. For the second challenge, we integrate an additional multi-label classification task into our framework to guide the model toward correct diagnostic predictions. We also propose a term-weighting scheme that reflects the importance of words during training, so that the model does not miss key discriminative information. Our work achieves promising performance over state-of-the-art methods on two benchmark datasets, including the largest dataset, MIMIC-CXR.
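The abstract names three supervision signals that are combined during training: a term-weighted generation loss, an image-text matching objective, and a multi-label diagnostic classification loss. The paper's exact formulations are not given here, so the following is only a minimal sketch of how such a multicriteria objective could be combined; the function names, the triplet-ranking form of the matching loss, and the weighting factors `lam_match` and `lam_cls` are all assumptions for illustration.

```python
import math

def weighted_token_nll(token_probs, weights):
    # Term-weighted negative log-likelihood: each ground-truth token's
    # probability is penalized in proportion to its importance weight,
    # so rare disease-related words are not drowned out by common ones.
    return sum(w * -math.log(p) for p, w in zip(token_probs, weights)) / sum(weights)

def matching_loss(sim_pos, sims_neg, margin=0.2):
    # Hinge-style image-text matching (triplet ranking, an assumption):
    # the matched image-report pair should score higher than mismatched
    # pairs by at least `margin`.
    return sum(max(0.0, margin - sim_pos + s) for s in sims_neg) / len(sims_neg)

def multilabel_bce(tag_probs, tags):
    # Multi-label binary cross-entropy over disease tags, one independent
    # binary decision per tag.
    eps = 1e-12
    return -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for p, y in zip(tag_probs, tags)) / len(tags)

def total_loss(token_probs, weights, tag_probs, tags, sim_pos, sims_neg,
               lam_match=1.0, lam_cls=1.0):
    # Joint objective: generation + matching + classification, with
    # hypothetical trade-off weights lam_match and lam_cls.
    return (weighted_token_nll(token_probs, weights)
            + lam_match * matching_loss(sim_pos, sims_neg)
            + lam_cls * multilabel_bce(tag_probs, tags))
```

In a real implementation these terms would operate on encoder/decoder outputs of the transformer; the sketch only shows how the three criteria contribute to a single scalar loss.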
