4.7 Article

Spine-transformers: Vertebra labeling and segmentation in arbitrary field-of-view spine CTs via 3D transformers

Journal

MEDICAL IMAGE ANALYSIS
Volume 75, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.media.2021.102258

Keywords

Vertebra labeling and segmentation; Transformers; Arbitrary field-of-view spine CT; One-to-one set prediction; Joint regression and segmentation

This study proposes a two-stage deep learning-based solution for fully automatic labeling and segmentation of 3D vertebrae in arbitrary FOV CT images. The first stage tackles the challenging vertebra labeling problem with a novel transformers-based 3D object detector, while the second stage involves training a single multi-task network for segmentation and refinement of detected centers. The method proves effective through comprehensive experiments on public and in-house datasets.
In this paper, we address the problem of fully automatic labeling and segmentation of 3D vertebrae in arbitrary Field-Of-View (FOV) CT images. We propose a deep learning-based two-stage solution to tackle these two problems. More specifically, in the first stage, the challenging vertebra labeling problem is solved via a novel transformers-based 3D object detector that views automatic detection of vertebrae in arbitrary FOV CT scans as a one-to-one set prediction problem. The main components of the new method, called Spine-Transformers, are a one-to-one set-based global loss that forces unique predictions and a lightweight 3D transformer architecture equipped with a skip connection and learnable positional embeddings for encoder and decoder, respectively. We additionally propose an inscribed-sphere-based object detector to replace the regular box-based object detector for better handling of volume orientation variations. Our method reasons about the relationships of different levels of vertebrae and the global volume context to directly infer all vertebrae in parallel. In the second stage, the segmentation of the identified vertebrae and the refinement of the detected centers are then done by training one single multi-task encoder-decoder network for all vertebrae, as the network does not need to identify which vertebra it is working on. The two tasks share a common encoder path but have different decoder paths. Comprehensive experiments are conducted on two public datasets and one in-house dataset. The experimental results demonstrate the efficacy of the present approach. © 2021 Elsevier B.V. All rights reserved.
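The abstract gives no formulas, so the following is only a minimal sketch of two ideas it names, not the authors' implementation. It assumes each vertebra is described by a sphere (center and radius), so that the overlap between a predicted and a reference sphere depends only on the center distance and the radii and is therefore unaffected by volume orientation. It further assumes, hypothetically, that a one-to-one set loss can be written with one fixed output slot per vertebra level. The names `sphere_iou` and `fixed_slot_set_loss`, and the particular combination of loss terms, are illustrative choices and do not come from the paper.

```python
import math


def sphere_volume(r):
    """Volume of a sphere with radius r."""
    return 4.0 / 3.0 * math.pi * r ** 3


def sphere_iou(c1, r1, c2, r2):
    """Intersection-over-union of two spheres (positive radii assumed).

    Only the center distance and the radii enter the result, so rotating
    the volume leaves the value unchanged.
    """
    d = math.dist(c1, c2)
    if d >= r1 + r2:
        inter = 0.0  # spheres do not overlap
    elif d <= abs(r1 - r2):
        inter = sphere_volume(min(r1, r2))  # smaller sphere is fully contained
    else:
        # Standard lens-volume formula for two partially overlapping spheres.
        inter = (math.pi * (r1 + r2 - d) ** 2
                 * (d * d + 2 * d * (r1 + r2) - 3 * (r1 - r2) ** 2)
                 / (12 * d))
    union = sphere_volume(r1) + sphere_volume(r2) - inter
    return inter / union


def fixed_slot_set_loss(pred, target, eps=1e-7):
    """Hypothetical one-to-one loss with one slot per vertebra level.

    pred:   list of (existence_probability, center, radius), one per slot.
    target: list of (center, radius) or None (vertebra outside the FOV).
    Absent slots contribute only an existence term; present slots add
    a (1 - IoU) localization term.
    """
    total = 0.0
    for (p, c, r), t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # keep the logarithm finite
        if t is None:
            total += -math.log(1.0 - p)
        else:
            tc, tr = t
            total += -math.log(p) + (1.0 - sphere_iou(c, r, tc, tr))
    return total / len(pred)
```

For instance, `sphere_iou((0, 0, 0), 1.0, (1, 0, 0), 1.0)` and `sphere_iou((0, 0, 0), 1.0, (0, 1, 0), 1.0)` give the same value, since both pairs of centers are one unit apart. An axis-aligned box overlap would not in general be preserved under such a change of orientation.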
