Article

An efficient but effective writer: Diffusion-based semi-autoregressive transformer for automated radiology report generation

Journal

BIOMEDICAL SIGNAL PROCESSING AND CONTROL
Volume 88

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.bspc.2023.105651

Keywords

Radiology report generation; Medical imaging; Image captioning; Clinical automation; Transformer

This study focuses on automatically generating radiology reports to alleviate the burden on doctors. The proposed abnormal semantic diffusion module and length-controllable self-attention decoder improve the efficiency and quality of report generation. Additionally, a novel clinical dataset, XRG-COVID-19, is constructed for experimental evaluation.
Manually diagnosing radiology images is clinically critical but labour-intensive and error-prone. An automatic radiology report generation method is therefore highly desirable for alleviating the burden on doctors. However, a typical report contains numerous template descriptions and only a few abnormal sentences, and this unbalanced distribution biases models towards generating template sentences. In addition, generating an entire report word by word can cause significant latency at inference time. Moreover, existing datasets are limited to conventional pneumonia, which makes them incomplete and one-sided. This work aims to achieve a better trade-off between generation performance and inference speed. One key design is an abnormal semantic diffusion module, which progressively absorbs the semantics of abnormal medical terminology and strengthens the linguistic coherence between local tokens. Specifically, the generated report is refined by incorporating more informative words that occur infrequently, which alleviates the monotony of template-based generation. Another design is a length-controllable self-attention decoder, which regulates the input length of the sentences used for target word generation. This design preserves the autoregressive nature of word generation while keeping the input within a controllable range, which ensures efficient report generation. Furthermore, a novel clinical dataset, XRG-COVID-19, is constructed, which includes X-ray scans and professional diagnostic reports of 8676 patients. Experimental results demonstrate that the proposed model achieves a better trade-off between performance and speed than carefully designed baselines on both the IU X-ray dataset and the proposed XRG-COVID-19 dataset.
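
The abstract does not say how the decoder bounds its input length. One way to picture the idea, purely as an illustrative assumption and not the authors' method, is a causal attention mask restricted to a fixed window of the most recent tokens, so each target word is still generated left to right but only looks at a bounded context. The function name `windowed_causal_mask` and the `window` parameter are hypothetical.

```python
from typing import List


def windowed_causal_mask(seq_len: int, window: int) -> List[List[bool]]:
    """Build a boolean attention mask for a bounded-context decoder.

    mask[i][j] is True when target position i may attend to position j.
    Assumption (not from the paper): position i sees only itself and the
    window - 1 positions immediately before it.
    """
    if seq_len < 0:
        raise ValueError("seq_len must be non-negative")
    if window < 1:
        raise ValueError("window must be at least 1")
    # Causal (j <= i) and bounded (j > i - window).
    return [
        [i - window < j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    # Print the mask for a 5-token sequence with a window of 2 tokens.
    for row in windowed_causal_mask(5, 2):
        print("".join("x" if allowed else "." for allowed in row))
```

With such a mask, the number of positions attended to per step is capped at `window` regardless of report length, which is one plausible reading of "a controllable range"; setting `window` to the sequence length or larger recovers ordinary causal attention.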
