Article

Hybrid-scale contextual fusion network for medical image segmentation

Journal

COMPUTERS IN BIOLOGY AND MEDICINE
Volume 152

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.compbiomed.2022.106439

Keywords

Convolutional neural networks; Transformer; Medical image segmentation

Abstract

Medical image segmentation results are an essential reference for disease diagnosis. Recently, with the development and application of convolutional neural networks, medical image processing has advanced significantly. However, most existing automatic segmentation tasks remain challenging due to variations in position, size, and shape, which result in poor segmentation performance. In addition, most current methods use an encoder-decoder architecture for feature extraction, focusing on the acquisition of semantic information while ignoring specific targets and global context information. In this work, we propose a hybrid-scale contextual fusion network to capture richer spatial and semantic information. First, a hybrid-scale embedding layer (HEL) is employed before the transformer. By mixing each embedding with multiple patches, object information at different scales can be captured effectively. Further, we apply a standard transformer to model long-range dependencies in the first two skip connections. Meanwhile, a pooling transformer (PTrans) is employed to handle the long input sequences in the following two skip connections. By leveraging a global average pooling operation and the corresponding transformer block, the spatial structure information of the target is learned effectively. Finally, a dual-branch channel attention module (DCA) is proposed to focus on crucial channel features and to conduct multi-level feature fusion simultaneously. Through this fusion scheme, richer context and fine-grained features are captured and encoded efficiently. Extensive experiments on three public datasets demonstrate that the proposed method outperforms state-of-the-art methods.
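The hybrid-scale embedding idea — letting each token mix information from patches of several sizes before the transformer — can be illustrated with a minimal sketch. This is an illustrative toy, not the paper's implementation: the patch scales, embedding dimension, random linear projections, and averaging fusion are all assumptions made for demonstration.

```python
import numpy as np

def extract_patches(img, p):
    """Split a (H, W, C) image into non-overlapping p x p patches,
    each flattened to a vector of length p*p*C."""
    H, W, C = img.shape
    patches = img.reshape(H // p, p, W // p, p, C).transpose(0, 2, 1, 3, 4)
    return patches.reshape(-1, p * p * C)

def hybrid_scale_embed(img, scales=(4, 8, 16), dim=32, rng=None):
    """Toy hybrid-scale embedding: project patches of several sizes to a
    common dimension, then let each base-scale token absorb the coarser-scale
    token that covers it. Scales, dim, and the averaging fusion are
    illustrative assumptions, not the paper's exact HEL design."""
    rng = np.random.default_rng(0) if rng is None else rng
    base = scales[0]
    H, W, C = img.shape
    n_base = (H // base) * (W // base)
    fused = np.zeros((n_base, dim))
    for s in scales:
        patches = extract_patches(img, s)               # one row per s x s patch
        proj = rng.standard_normal((s * s * C, dim)) / np.sqrt(s * s * C)
        tokens = patches @ proj                         # linear patch embedding
        # broadcast each coarse token to every base-scale position it covers
        grid = tokens.reshape(H // s, W // s, dim)
        up = grid.repeat(s // base, axis=0).repeat(s // base, axis=1)
        fused += up.reshape(n_base, dim)
    return fused / len(scales)

img = np.arange(64 * 64 * 3, dtype=float).reshape(64, 64, 3) / 1e4
emb = hybrid_scale_embed(img)
print(emb.shape)  # (256, 32): one embedding per 4x4 base patch
```

Each of the 256 base-scale tokens thus carries information from the 4x4, 8x8, and 16x16 patches that contain it, so object cues at several scales enter the transformer in a single sequence.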
