Article

ViT-P: Classification of Genitourinary Syndrome of Menopause From OCT Images Based on Vision Transformer Models

Journal

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TIM.2021.3122121

Keywords

Deep learning; genitourinary syndrome of menopause (GSM); image classification; optical coherence tomography (OCT); vision transformer (ViT)

Funding

  1. Science and Technology Department of Jilin Province of China [20210204193YY]
  2. National Key Research and Development Project of China [2019YFC0409105]
  3. Zhongshan Science and Technology Bureau [2018B1021]
  4. Department of Education of Guangdong Province [2018KQNCX332]
  5. University of Electronic Science and Technology of China, Zhongshan Institute [418YKQN08]

This study introduces the vision transformer (ViT) to medical OCT images for the first time and proposes a deep learning-based approach for GSM lesion screening. By building a GSM dataset and experimental model, it aims to address practical issues and improve classification accuracy in OCT images, reducing the workload of gynecologists.
Genitourinary syndrome of menopause (GSM) is a condition caused by the physiological decline in estrogen levels, and it can negatively affect a woman's overall health and quality of life, including sexual function. Real-time optical biopsy images can now be obtained with optical coherence tomography (OCT) systems. In this study, we introduce the vision transformer (ViT) to the field of medical OCT images for the first time and propose a deep learning-based approach for GSM lesion screening. Specifically, we first build a GSM dataset to train and evaluate the experimental models. To address practical problems that arise during data collection, such as class imbalance, we propose a method that combines atrous ("null") convolution with a deep convolutional generative adversarial network (DCGAN) classifier to generate the samples needed for training. Next, we present ViT PLUS (ViT-P) for the vaginal OCT image classification task; it addresses the shortcomings of ViT in extracting the patch embedding by using a multibranch convolutional neural network combined with a channel attention mechanism. Clinical images acquired by the OCT device can then be classified automatically, reducing the workload of gynecologists. Experimental results show that the ViT-P model outperforms the CNN model and ViT for case screening on the GSM and UCSD datasets, with accuracies reaching 99.9% and 99.69%, respectively.
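
The abstract does not say how the channel attention in the patch-embedding stage is built. As a rough illustration only, the sketch below assumes a squeeze-and-excitation style gate applied to the concatenated outputs of two convolutional branches. The function names, the tiny 2 x 2 feature maps, and the weights `w1` and `w2` are all hypothetical and are not taken from the paper.

```python
import math

def num_patches(image_size, patch_size):
    """Token count for a square image split into square ViT patches."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    return (image_size // patch_size) ** 2

def global_avg_pool(feature_maps):
    """Squeeze step: reduce each H x W channel map to its mean value."""
    return [sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
            for fmap in feature_maps]

def channel_attention(feature_maps, w1, w2):
    """Assumed SE-style gate: squeeze -> FC + ReLU -> FC + sigmoid -> rescale.

    feature_maps: list of C channel maps (nested lists, H x W each)
    w1: hidden x C weights, w2: C x hidden weights (hypothetical, untrained)
    """
    squeezed = global_avg_pool(feature_maps)
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Multiply every value in channel c by that channel's gate.
    scaled = [[[v * g for v in row] for row in fmap]
              for fmap, g in zip(feature_maps, gates)]
    return scaled, gates

# Two hypothetical branch outputs, one channel each, concatenated channel-wise.
branch_a = [[[1.0, 2.0], [3.0, 4.0]]]
branch_b = [[[0.0, 0.0], [0.0, 8.0]]]
features = branch_a + branch_b

w1 = [[1.0, -1.0]]      # 1 hidden unit, 2 input channels
w2 = [[2.0], [-2.0]]    # 2 output gates, 1 hidden unit
scaled, gates = channel_attention(features, w1, w2)
tokens = num_patches(224, 16)
```

In a setup like this, the gate lets the network weight one branch's channels more heavily than another's before the feature map is flattened into patch tokens. The 224-pixel image and 16-pixel patch are common ViT defaults used here for illustration, not values reported for ViT-P.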

