Article

Hybrid transformer UNet for thyroid segmentation from ultrasound scans

COMPUTERS IN BIOLOGY AND MEDICINE (2023)

Journal

COMPUTERS IN BIOLOGY AND MEDICINE

Volume 153

Publisher

PERGAMON-ELSEVIER SCIENCE LTD

DOI: 10.1016/j.compbiomed.2022.106453

Keywords

Thyroid gland segmentation; Ultrasound image processing; Deep learning; Attention mechanism; Transformer

Categories

Biology; Computer Science, Interdisciplinary Applications; Engineering, Biomedical; Mathematical & Computational Biology

Abstract

This paper proposes a novel hybrid transformer UNet (H-TUNet) for thyroid gland segmentation in ultrasound sequences. It integrates and refines low-level features from different encoding layers using a designed multi-scale cross-attention transformer (MSCAT) module. It also strengthens contextual features from successive frames using a 3D self-attention transformer (SAT) module. Experimental results on TSUD and TG3k datasets demonstrate the superiority of the proposed method in thyroid gland segmentation.

Deep learning-based medical image segmentation methods have been widely used for thyroid gland segmentation from ultrasound images, which is of great importance for the diagnosis of thyroid disease since it can provide various valuable sonography features. However, existing thyroid gland segmentation models suffer from two problems: (1) low-level features that are significant in depicting thyroid boundaries are gradually lost during feature encoding, and (2) contextual features reflecting the changes in the differences between the thyroid and other anatomies during the ultrasound diagnosis process are either omitted by 2D convolutions or weakly represented by 3D convolutions due to high redundancy. In this work, we propose a novel hybrid transformer UNet (H-TUNet) to segment thyroid glands in ultrasound sequences, which consists of two parts: (1) a 2D Transformer UNet that applies a designed multi-scale cross-attention transformer (MSCAT) module on every skip connection of the UNet, so that the low-level features from different encoding layers are integrated and refined according to the high-level features in the decoding scheme, leading to a better representation of differences between anatomies in one ultrasound frame; and (2) a 3D Transformer UNet that applies a 3D self-attention transformer (SAT) module to the bottom layer of a 3D UNet, so that the contextual features representing visual differences between regions and consistencies within regions can be strengthened from successive frames in the video. The H-TUNet is formulated as a unified end-to-end network, so the intra-frame feature extraction and inter-frame feature aggregation can be learned and optimized jointly. The proposed method was evaluated on the Thyroid Segmentation in Ultrasonography Dataset (TSUD) and the TG3k dataset. Experimental results demonstrate that our method outperformed other state-of-the-art methods on certain benchmarks for thyroid gland segmentation.
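
The abstract names two attention mechanisms but does not give their internals. As a rough illustration only, the sketch below shows generic scaled dot-product attention in plain Python and how the same routine can act as cross-attention (queries from high-level decoder features, keys and values from low-level encoder features) or as self-attention (all three taken from tokens of successive frames). It is not the authors' MSCAT or SAT module: learned projections, multiple heads, positional encodings, and the multi-scale fusion are omitted, and every function name and toy value here is a hypothetical placeholder.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention without learned projections.

    queries: list of n_q vectors of length d
    keys:    list of n_k vectors of length d
    values:  list of n_k vectors of length d_v
    Returns n_q vectors of length d_v, each a weighted average of `values`.
    """
    d = len(queries[0])
    d_v = len(values[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(d_v)])
    return out


# Cross-attention on a skip connection (assumed toy tokens):
# queries stand in for high-level decoder features, keys/values for
# low-level encoder features gathered from several encoding layers.
decoder_tokens = [[1.0, 0.0], [0.0, 1.0]]
encoder_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
refined_skip = attention(decoder_tokens, encoder_tokens, encoder_tokens)

# Self-attention across frames (assumed toy tokens):
# queries, keys, and values are the same bottleneck tokens,
# one per successive frame in this simplified picture.
frame_tokens = [[0.5, 0.5], [0.4, 0.6], [0.6, 0.4]]
context = attention(frame_tokens, frame_tokens, frame_tokens)
```

In this simplified picture, `refined_skip` has one output vector per decoder token, so the low-level features are re-weighted according to what the decoder is asking for, while `context` mixes information across frames without changing the number of tokens.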
