Article

TD-Net: unsupervised medical image registration network based on Transformer and CNN

Journal

APPLIED INTELLIGENCE
Volume 52, Issue 15, Pages 18201-18209

Publisher

SPRINGER
DOI: 10.1007/s10489-022-03472-w

Keywords

Deformable image registration; Deep learning; CNN; Transformer

Funding

  1. National Natural Science Foundation of China [61772226, 61862056]
  2. Science and Technology Development Program of Jilin Province [20210204133YY]
  3. Natural Science Foundation of Jilin Province [20200201159JC]
  4. Key Laboratory for Symbol Computation and Knowledge Engineering of the National Education Ministry of China
  5. Jilin University

Medical image registration is a crucial task in computer-aided medical diagnosis, and recent research has focused on using deep learning methods to improve its accuracy. This paper introduces the use of Transformer, a powerful global modeling tool, to enhance medical image registration. By combining Transformer with CNN in a hybrid network, the proposed method is able to extract both local and global information, resulting in improved accuracy in brain MRI scans compared to state-of-the-art approaches.
Medical image registration is a fundamental task in computer-aided medical diagnosis. Recently, researchers have begun to use deep learning methods based on convolutional neural networks (CNNs) for registration and have achieved remarkable results. Although CNN-based methods provide rich local information for registration, their global modeling ability is weak: they cannot carry out long-distance information interaction, which restricts registration performance. The Transformer was originally used for sequence-to-sequence prediction, and it now also achieves strong results in various visual tasks thanks to its global modeling capability. Compared with a CNN, a Transformer provides rich global information but lacks local information. To address this, we propose a hybrid network, similar to U-Net, that combines Transformer and CNN to extract global and local information at each level. Specifically, a CNN is first used to obtain feature maps of the image, and the Transformer is used as an encoder to extract global information. The results of the Transformer encoding are then connected to the upsampling process, which uses a CNN to integrate local and global information. Finally, the resolution is restored to that of the input image, and the displacement field is obtained after several convolution layers. We evaluate our method on brain MRI scans. Experimental results demonstrate that our method improves accuracy by 1% compared with state-of-the-art approaches.
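The abstract states that the network outputs a displacement field but does not say how that field is applied. The sketch below illustrates one common convention, assumed here and not taken from the paper: each output pixel samples the moving image at its own position plus the displacement, using bilinear interpolation. It works on a 2D image stored as nested lists and uses only the standard library. The names `bilinear_sample` and `warp` are hypothetical, and the paper's brain MRI data would be volumetric and need a 3D analogue.

```python
import math


def bilinear_sample(image, y, x):
    """Sample a 2D image (list of rows) at a fractional (y, x) location.

    Assumption: coordinates outside the image are clamped to the border.
    """
    h, w = len(image), len(image[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = (1 - wx) * image[y0][x0] + wx * image[y0][x1]
    bottom = (1 - wx) * image[y1][x0] + wx * image[y1][x1]
    return (1 - wy) * top + wy * bottom


def warp(moving, displacement):
    """Warp `moving` with a dense displacement field.

    displacement[i][j] is a (dy, dx) pair, and
    warped[i][j] = moving sampled at (i + dy, j + dx).
    """
    h, w = len(moving), len(moving[0])
    return [
        [
            bilinear_sample(
                moving,
                i + displacement[i][j][0],
                j + displacement[i][j][1],
            )
            for j in range(w)
        ]
        for i in range(h)
    ]


# Toy 3x3 "moving" image and a field that looks one pixel to the right.
moving = [[0.0, 1.0, 2.0],
          [3.0, 4.0, 5.0],
          [6.0, 7.0, 8.0]]
shift_right = [[(0.0, 1.0)] * 3 for _ in range(3)]
warped = warp(moving, shift_right)
```

In an unsupervised setting, a warped image like this would typically be compared with the fixed image by a similarity loss, so that no ground-truth displacement fields are needed. The abstract does not specify the loss used.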
