3D multi-scale vision transformer for lung nodule detection in chest CT images

SIGNAL IMAGE AND VIDEO PROCESSING (2023)

Journal

SIGNAL IMAGE AND VIDEO PROCESSING

Volume 17, Issue 5, Pages 2473-2480

Publisher

SPRINGER LONDON LTD

DOI: 10.1007/s11760-022-02464-0

Keywords

Computer-aided diagnosis; Computed tomography; Vision transformer; Lung nodule; 3D-MSViT

Automated Summary

Lung cancer is the leading cause of cancer-related death, and radiologists use computed tomography (CT) to diagnose lung nodules. Manual analysis of hundreds of CT images by radiologists is burdensome and sometimes inaccurate. This study proposes a CAD scheme based on a 3D multi-scale vision transformer (3D-MSViT) to enhance feature extraction and improve lung nodule prediction efficiency.

Abstract

Lung cancer has become the leading cause of cancer-related death. Radiologists normally use computed tomography (CT) to diagnose lung nodules in lung cancer patients. A single CT scan produces hundreds of images that radiologists must analyze manually, which is a heavy burden and sometimes leads to inaccurate readings. Recently, many computer-aided diagnosis (CAD) systems built on deep learning architectures have been proposed to assist radiologists. This study proposes a CAD scheme based on a 3D multi-scale vision transformer (3D-MSViT) to enhance multi-scale feature extraction and improve lung nodule prediction efficiency on 3D CT images. The 3D-MSViT architecture adopts a local-global transformer block structure: the local transformer stage processes the patches of each scale individually and forwards them to the global transformer stage, which merges the multi-scale features. The transformer blocks rely entirely on the attention mechanism, without any convolutional neural network, to reduce the number of network parameters. The proposed CAD scheme was validated on the 888 CT scans of the public Lung Nodule Analysis 2016 (LUNA16) dataset and evaluated with free-response receiver operating characteristic (FROC) analysis. The 3D-MSViT algorithm achieved a highest sensitivity of 97.81% and a competition performance metric (CPM) of 0.911. The 3D-MSViT scheme therefore obtains results comparable to those of counterpart deep learning approaches in prior studies, with lower network complexity.
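The abstract reports a competition performance metric from FROC analysis without defining it. In the LUNA16 challenge, this metric is the mean sensitivity at seven operating points: 1/8, 1/4, 1/2, 1, 2, 4, and 8 false positives per scan. The sketch below shows one way such a score could be computed from a ranked list of candidate detections. The function names `froc_curve` and `cpm`, the toy inputs, and the step-wise lookup (instead of the interpolation an official evaluation script may use) are illustrative assumptions, not the authors' evaluation code.

```python
from statistics import mean

# The seven operating points of the LUNA16 challenge (false positives per scan).
FP_RATES = (0.125, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0)


def froc_curve(scores, is_nodule, num_nodules, num_scans):
    """Sweep the score threshold from high to low.

    Returns a list of (false positives per scan, sensitivity) points.
    Assumes each true detection matches a distinct nodule.
    """
    ranked = sorted(zip(scores, is_nodule), key=lambda pair: -pair[0])
    curve, true_pos, false_pos = [], 0, 0
    for _, hit in ranked:
        if hit:
            true_pos += 1
        else:
            false_pos += 1
        curve.append((false_pos / num_scans, true_pos / num_nodules))
    return curve


def cpm(curve, rates=FP_RATES):
    """Average the best sensitivity reachable at or below each FP rate."""
    values = []
    for rate in rates:
        reachable = [sens for fps, sens in curve if fps <= rate]
        values.append(max(reachable, default=0.0))
    return mean(values)


# Hypothetical toy data: four candidates over two scans containing two nodules.
toy_curve = froc_curve(
    scores=[0.9, 0.8, 0.7, 0.6],
    is_nodule=[True, False, True, False],
    num_nodules=2,
    num_scans=2,
)
toy_cpm = cpm(toy_curve)
```

Because the score averages over low false-positive rates as well as high ones, a detector that only reaches high sensitivity at many false positives per scan is penalized, which is why the abstract reports it alongside the peak sensitivity.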
