Article

ResViT: Residual Vision Transformers for Multimodal Medical Image Synthesis

Journal

IEEE TRANSACTIONS ON MEDICAL IMAGING
Volume 41, Issue 10, Pages 2598-2614

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TMI.2022.3167808

Keywords

Transformers; Biomedical imaging; Subspace constraints; Task analysis; Image synthesis; Magnetic resonance imaging; Computer architecture; Medical image synthesis; transformer; residual; vision; adversarial; generative; unified

Funding

  1. Scientific and Technological Research Council of Turkey BIDEB Scholarship
  2. Turkish Academy of Sciences GEBIP 2015 Fellowship
  3. Science Academy BAGEP 2017 Fellowship

Abstract

This paper proposes a novel generative adversarial approach, ResViT, which combines the contextual sensitivity of vision transformers, the precision of convolution operators, and the realism of adversarial learning. Demonstrations show that ResViT outperforms competing methods based on CNNs and transformers in terms of qualitative observations and quantitative metrics.
Generative adversarial models with convolutional neural network (CNN) backbones have recently been established as state-of-the-art in numerous medical image synthesis tasks. However, CNNs are designed to perform local processing with compact filters, and this inductive bias compromises learning of contextual features. Here, we propose a novel generative adversarial approach for medical image synthesis, ResViT, that leverages the contextual sensitivity of vision transformers along with the precision of convolution operators and realism of adversarial learning. ResViT's generator employs a central bottleneck comprising novel aggregated residual transformer (ART) blocks that synergistically combine residual convolutional and transformer modules. Residual connections in ART blocks promote diversity in captured representations, while a channel compression module distills task-relevant information. A weight sharing strategy is introduced among ART blocks to mitigate computational burden. A unified implementation is introduced to avoid the need to rebuild separate synthesis models for varying source-target modality configurations. Comprehensive demonstrations are performed for synthesizing missing sequences in multi-contrast MRI, and CT images from MRI. Our results indicate superiority of ResViT against competing CNN- and transformer-based methods in terms of qualitative observations and quantitative metrics.
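
The abstract describes the ART block only at a high level, so the following is a rough NumPy sketch of one way such a block could be wired, not the authors' implementation. It assumes single-head self-attention over flattened spatial positions, a 1x1 convolution for channel compression, one 3x3 residual convolution, and attention weights shared across blocks. All function names, shapes, and the ordering of steps are hypothetical choices made for illustration.

```python
import numpy as np

def self_attention(tokens, wq, wk, wv):
    """Single-head self-attention over tokens of shape (N, C)."""
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def conv3x3(x, kernel):
    """Zero-padded 3x3 convolution. x: (C_in, H, W), kernel: (C_out, C_in, 3, 3)."""
    _, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((kernel.shape[0], h, w))
    for dy in range(3):
        for dx in range(3):
            patch = padded[:, dy:dy + h, dx:dx + w]
            out += np.einsum("oc,chw->ohw", kernel[:, :, dy, dx], patch)
    return out

def art_block(x, shared_attn, compress_w, conv_kernel):
    """Hypothetical ART-style block on a feature map x of shape (C, H, W)."""
    c, h, w = x.shape
    tokens = x.reshape(c, h * w).T                       # (H*W, C) token sequence
    ctx = tokens + self_attention(tokens, *shared_attn)  # contextual (transformer) branch
    ctx = ctx.T.reshape(c, h, w)
    stacked = np.concatenate([x, ctx], axis=0)           # keep input alongside context: (2C, H, W)
    compressed = np.einsum("oc,chw->ohw", compress_w, stacked)  # 1x1 conv, 2C -> C
    # Residual convolutional branch with ReLU (assumed activation).
    return compressed + np.maximum(conv3x3(compressed, conv_kernel), 0.0)

def make_bottleneck(channels, n_blocks, seed=0):
    """Random parameters: one shared attention set, per-block conv weights."""
    rng = np.random.default_rng(seed)
    shared_attn = tuple(rng.normal(0, 0.1, (channels, channels)) for _ in range(3))
    blocks = [
        (rng.normal(0, 0.1, (channels, 2 * channels)),
         rng.normal(0, 0.1, (channels, channels, 3, 3)))
        for _ in range(n_blocks)
    ]
    return shared_attn, blocks

def bottleneck_forward(x, shared_attn, blocks):
    """Apply the blocks in sequence, reusing the same attention weights in each."""
    for compress_w, conv_kernel in blocks:
        x = art_block(x, shared_attn, compress_w, conv_kernel)
    return x

if __name__ == "__main__":
    shared_attn, blocks = make_bottleneck(channels=8, n_blocks=3)
    x = np.random.default_rng(1).normal(size=(8, 4, 4))
    print(bottleneck_forward(x, shared_attn, blocks).shape)
```

In this sketch the attention matrices are created once and reused by every block, so adding blocks only adds the per-block compression and convolution weights. This mirrors the motivation the abstract gives for weight sharing, under the assumptions stated above.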
