Article

Breast cancer diagnosis through knowledge distillation of Swin transformer-based teacher-student models

Journal

Publisher

IOP Publishing Ltd
DOI: 10.1088/2632-2153/ad10cc

Keywords

teacher model; student model; Swin-transformers; transfer learning; knowledge distillation; breast cancer histopathology


This article introduces two Swin transformer-based models, the teacher model and the student model, for diagnosing breast cancer. The models are trained using transfer learning and knowledge distillation, and the SARSA algorithm is used to improve accuracy and training efficiency. The student model shows promising performance in WSI analysis.
Breast cancer is a significant global health concern, and timely, accurate diagnosis is crucial for improving survival rates. Traditional diagnosis relies on pathologists analyzing whole-slide images (WSIs) to identify malignancies. This task is complex, demands specialized expertise, and imposes a substantial workload on pathologists. In addition, existing deep learning models for classifying histopathology images often need enhancement before they are suitable for real-time deployment on WSIs, especially when they are trained on small regions of interest (ROIs). This article introduces two Swin transformer-based architectures: a moderately sized teacher model and a lightweight student model. Both models are trained on a publicly available dataset of breast cancer histopathology images, focusing on ROIs at varying magnification factors. The teacher model is trained with transfer learning, and knowledge distillation (KD) transfers its capabilities to the student model. To enhance validation accuracy and minimize the total KD loss, we employ the state-action-reward-state-action (SARSA) reinforcement learning algorithm, which dynamically computes the temperature and a weighting factor throughout the KD process and achieves high accuracy within a considerably shorter training time. The student model is also deployed to analyze malignancies in WSIs. Although the student model has only one-third of the teacher model's size and FLOPs, it achieves an accuracy of 98.71%, slightly below the teacher's accuracy of 98.91%. Experimental results demonstrate that the student model can process WSIs at a throughput of 1.67 samples s⁻¹ with an accuracy of 82%. The proposed student model, trained using KD and the SARSA algorithm, exhibits promising breast cancer classification and WSI analysis performance. These findings indicate its potential for assisting pathologists in diagnosing breast cancer accurately and effectively.
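
To make the roles of the temperature and the weighting factor concrete, the following is a minimal Python sketch of a standard (Hinton-style) distillation loss together with a tabular SARSA update. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the loss form, the state and action encoding, or the reward. The names kd_loss, sarsa_update, alpha, lr and gamma are hypothetical, as is the idea of treating each (temperature, alpha) pair as an action rewarded by a change in validation accuracy.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, label, temperature, alpha):
    """Distillation loss for one sample (assumed Hinton-style form).

    alpha weights the soft (teacher) term; 1 - alpha weights the
    hard-label cross-entropy term.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on softened distributions, scaled by T^2.
    soft = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0
    )
    soft *= temperature ** 2
    # Cross-entropy against the ground-truth label at temperature 1.
    hard = -math.log(softmax(student_logits)[label])
    return alpha * soft + (1.0 - alpha) * hard


def sarsa_update(q, state, action, reward, next_state, next_action,
                 lr=0.1, gamma=0.9):
    """On-policy SARSA: Q(s,a) += lr * (r + gamma * Q(s',a') - Q(s,a))."""
    old = q.get((state, action), 0.0)
    target = reward + gamma * q.get((next_state, next_action), 0.0)
    q[(state, action)] = old + lr * (target - old)
    return q[(state, action)]


if __name__ == "__main__":
    # Hypothetical action: a (temperature, alpha) pair chosen for one epoch.
    action = (4.0, 0.7)
    loss = kd_loss([1.5, -0.5], [2.0, -1.0], label=0,
                   temperature=action[0], alpha=action[1])
    q_table = {}
    # Hypothetical reward: improvement in validation accuracy.
    sarsa_update(q_table, "epoch0", action, 0.02, "epoch1", (2.0, 0.5))
    print(loss, q_table)
```

In this sketch a higher temperature softens both distributions so the student also learns from the teacher's relative class probabilities, while alpha trades that signal off against the ground-truth label. An agent that adjusts both values during training, as the abstract describes, would select the next action from the Q-table with an on-policy rule such as epsilon-greedy.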
