Article

MuSiC-ViT: A multi-task Siamese convolutional vision transformer for differentiating change from no-change in follow-up chest radiographs

MEDICAL IMAGE ANALYSIS (2023)

Journal

MEDICAL IMAGE ANALYSIS

Volume 89

Publisher

ELSEVIER

DOI: 10.1016/j.media.2023.102894

Keywords

CNNs meet vision transformers; Follow-up chest radiographs; Multi-task learning; Vision transformer; Siamese network

Automated Summary

A major responsibility of radiologists is to read follow-up chest radiographs and differentiate meaningful changes from natural or benign variations. This study proposes the use of a multi-task Siamese convolutional vision transformer with an anatomy-matching module to mimic the radiologist's cognitive process.

Abstract

A major responsibility of radiologists in routine clinical practice is to read follow-up chest radiographs (CXRs) to identify changes in a patient's condition. Diagnosing meaningful changes in follow-up CXRs is challenging because radiologists must differentiate disease changes from natural or benign variations. Here, we propose a multi-task Siamese convolutional vision transformer (MuSiC-ViT) with an anatomy-matching module (AMM) that mimics the radiologist's cognitive process for differentiating change from no-change with respect to the baseline. MuSiC-ViT uses the CNNs meet vision transformers (CMT) model, which combines convolutional neural network (CNN) and transformer architectures. It has three major components: a Siamese network architecture, an AMM, and multi-task learning. Because the input is a pair of CXRs, a Siamese network was adopted for the encoder. The AMM is an attention module that focuses on related regions in the CXR pairs. To mimic a radiologist's cognitive process, MuSiC-ViT was trained with multi-task learning on normal/abnormal classification, change/no-change classification, and anatomy matching. Among 406 K CXRs studied, 88 K change and 115 K no-change pairs were acquired for the training dataset. The internal validation dataset consisted of 1,620 pairs. To demonstrate the robustness of MuSiC-ViT, we verified the results on two other validation datasets. MuSiC-ViT achieved an accuracy and an area under the receiver operating characteristic curve (AUC) of 0.728 and 0.797, respectively, on the internal validation dataset, 0.614 and 0.784 on the first external validation dataset, and 0.745 and 0.858 on a second, temporally separated validation dataset. All code is available at https://github.com/chokyungjin/MuSiC-ViT.
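The Siamese encoder and the multi-task objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes a toy one-layer encoder in place of the convolutional vision transformer, binary cross-entropy for both classification tasks, and equal task weights. All function names, parameter names, and numeric values are hypothetical. The AMM and the anatomy-matching loss are omitted because the abstract does not specify their form.

```python
import math


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def encode(image, encoder_weights):
    """Toy stand-in for the shared encoder: one linear layer followed by tanh."""
    return [math.tanh(dot(row, image)) for row in encoder_weights]


def forward(baseline, followup, params):
    """Run a CXR pair through one shared encoder and two task heads."""
    # Siamese step: the SAME encoder weights process both radiographs.
    feat_base = encode(baseline, params["encoder"])
    feat_follow = encode(followup, params["encoder"])
    return {
        # Normal/abnormal head, applied to each image separately.
        "abnormal_base": sigmoid(dot(params["abnormal_head"], feat_base)),
        "abnormal_follow": sigmoid(dot(params["abnormal_head"], feat_follow)),
        # Change/no-change head, applied to the concatenated pair features.
        "change": sigmoid(dot(params["change_head"], feat_base + feat_follow)),
    }


def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one predicted probability p and label y."""
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))


def multitask_loss(outputs, labels, w_abnormal=1.0, w_change=1.0):
    """Weighted sum of the per-task losses (weights are assumptions)."""
    abnormal = (bce(outputs["abnormal_base"], labels["abnormal_base"])
                + bce(outputs["abnormal_follow"], labels["abnormal_follow"]))
    change = bce(outputs["change"], labels["change"])
    return w_abnormal * abnormal + w_change * change


# Hypothetical parameters and 3-pixel "images", for illustration only.
params = {
    "encoder": [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]],
    "abnormal_head": [1.0, -1.0],
    "change_head": [0.7, -0.3, -0.7, 0.3],
}
baseline = [0.2, 0.4, 0.6]
followup = [0.9, 0.1, 0.3]
labels = {"abnormal_base": 0, "abnormal_follow": 1, "change": 1}

outputs = forward(baseline, followup, params)
loss = multitask_loss(outputs, labels)
print(outputs, loss)
```

Because both images pass through the same weights, an identical pair yields identical features. In this toy setup the change head's weights are antisymmetric across the two halves, so an identical pair gives a change probability of 0.5. A trained model would learn its own head rather than rely on that choice.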
