Article

Transformer with progressive sampling for medical cellular image segmentation

Journal

MATHEMATICAL BIOSCIENCES AND ENGINEERING
Volume 19, Issue 12, Pages 12104-12126

Publisher

AMER INST MATHEMATICAL SCIENCES-AIMS
DOI: 10.3934/mbe.2022563

Keywords

medical segmentation; self-attentive mechanism; transformer; strip convolution module; pyramid pooling module

Funding

  1. National Natural Science Foundation of China [61772319, 62002200]
  2. Shandong Natural Science Foundation of China [ZR2021QF134, ZR2021MF068]
  3. Yantai science and technology innovation development plan [2022JCYJ031]

This paper applies the transformer to medical image segmentation, a field whose backbones have been convolutional neural networks, and proposes a gated position-sensitive axial attention mechanism so that the network can be trained on small datasets. Iterative sampling is also used to update the sampling positions and reduce interference from irrelevant regions. Experimental results show an improvement in segmentation accuracy compared with networks of recent years.
Convolutional neural networks, the usual backbone for medical image segmentation, have performed well in recent years. However, they focus on local regions and have difficulty modelling global contextual information. For this reason, the transformer, originally used for text processing, was introduced into medical segmentation, and its strength in modelling global relationships further improved segmentation accuracy. However, transformer-based networks require a sufficiently large training set to achieve satisfactory segmentation results, and most medical segmentation datasets are small. Therefore, in this paper we introduce a gated position-sensitive axial attention mechanism in the self-attention module, so that the transformer-based network can also be adapted to small datasets. When handling segmentation tasks, vision transformers commonly divide the input image into equal-sized patches and then process each patch, but this simple division may destroy the structure of the original image, and the resulting grid may contain large unimportant regions, causing attention to stay on regions of no interest and degrading segmentation performance. Therefore, we add iterative sampling to update the sampling positions, so that attention stays on the region to be segmented, reducing interference from irrelevant regions and further improving segmentation performance. In addition, we introduce the strip convolution module (SCM) and pyramid pooling module (PPM) to capture global contextual information. The proposed network is evaluated on several datasets and shows some improvement in segmentation accuracy compared with networks of recent years.
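The equal-patch division that the abstract identifies as a weakness can be illustrated with a small sketch. This is not the authors' code: the function names `split_into_patches` and `foreground_fraction`, the 8x8 toy mask, and the patch size are all assumptions chosen for illustration, and plain Python lists stand in for image tensors.

```python
def split_into_patches(image, patch):
    """Split a 2D list `image` (H x W) into non-overlapping patch x patch tiles.

    Returns a dict mapping (grid_row, grid_col) -> tile (a list of rows).
    Hypothetical helper for illustration only.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by patch size")
    tiles = {}
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tiles[(top // patch, left // patch)] = [
                row[left:left + patch] for row in image[top:top + patch]
            ]
    return tiles


def foreground_fraction(tile):
    """Return the fraction of non-zero pixels in a tile."""
    pixels = [p for row in tile for p in row]
    return sum(1 for p in pixels if p) / len(pixels)


# Toy 8x8 binary mask: a small 2x3 "cell" that straddles patch boundaries.
image = [[0] * 8 for _ in range(8)]
for r in range(3, 5):
    for c in range(2, 5):
        image[r][c] = 1

tiles = split_into_patches(image, patch=2)

# Tiles that contain no foreground at all (pure background).
empty = [idx for idx, tile in tiles.items() if foreground_fraction(tile) == 0.0]
# Tiles that contain part of the cell.
occupied = sorted(idx for idx in tiles if idx not in empty)
```

On this toy mask, 12 of the 16 tiles are pure background, and the single cell is cut across the four tiles at grid positions (1, 1), (1, 2), (2, 1) and (2, 2). A fixed grid therefore spends most of its tokens on irrelevant regions and fragments the object, which is the motivation the abstract gives for updating sampling positions iteratively.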
