4.7 Article

An efficient and robust Phonocardiography (PCG)-based Valvular Heart Diseases (VHD) detection framework using Vision Transformer (ViT)

Journal

Computers in Biology and Medicine
Volume 158

Publisher

Pergamon-Elsevier Science Ltd
DOI: 10.1016/j.compbiomed.2023.106734

Keywords

Valvular heart diseases (VHD); Phonocardiography (PCG); Vision Transformer (ViT); D-CNNs; Deep learning (DL); Machine learning (ML)

This study proposes a new high-performance valvular heart disease (VHD) detection framework based on deep learning, which is relatively simple in terms of network structures but can accurately detect multiple VHDs. Both 1D and 2D PCG signals are used, and nature/bio-inspired algorithms are utilized for feature selection. The results show that the ViT model achieves the best performance among all classifiers, surpassing current state-of-the-art VHD classification models.
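The paper does not publish its implementation, so the following is only a minimal, hypothetical sketch of how a nature/bio-inspired wrapper such as a genetic algorithm (GA) could select a binary subset of PCG features. The population size, mutation rate, single-point crossover, and the SVM cross-validation fitness are illustrative assumptions, not the authors' configuration.

```python
# Hypothetical GA wrapper for binary feature selection (illustrative, not the authors' code).
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def fitness(mask, X, y):
    """Mean 5-fold CV accuracy of an SVM trained on the features kept by `mask`."""
    if mask.sum() == 0:
        return 0.0
    return cross_val_score(SVC(kernel="rbf"), X[:, mask.astype(bool)], y, cv=5).mean()

def ga_select(X, y, pop_size=30, generations=40, p_mut=0.05, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    n_feat = X.shape[1]
    pop = rng.integers(0, 2, size=(pop_size, n_feat))        # random binary masks
    for _ in range(generations):
        scores = np.array([fitness(ind, X, y) for ind in pop])
        parents = pop[np.argsort(scores)[-pop_size // 2:]]    # keep the fitter half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_feat)                     # single-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            flip = rng.random(n_feat) < p_mut                 # bit-flip mutation
            child[flip] = 1 - child[flip]
            children.append(child)
        pop = np.vstack([parents, children])
    best = pop[np.argmax([fitness(ind, X, y) for ind in pop])]
    return best.astype(bool)                                  # boolean mask over feature columns
```

In the 1D pipeline described in the abstract below, X would hold per-recording MFCC/LPCC descriptors and y the VHD labels; a PSO-based selector follows the same wrapper pattern with a particle-velocity update in place of crossover and mutation.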
Background and objectives: Valvular heart diseases (VHDs) are among the dominant causes of cardiovascular abnormalities and are associated with high mortality rates globally. Rapid and accurate diagnosis of early-stage VHD from the cardiac phonocardiogram (PCG) signal is critical, as it allows for optimal medication and a reduction in mortality.

Methods: To this end, the current study proposes novel deep learning (DL)-based high-performance VHD detection frameworks that are relatively simple in terms of network structure yet effective for accurately detecting multiple VHDs. We present three different frameworks considering both 1D and 2D raw PCG signals. For 1D PCG, Mel-frequency cepstral coefficient (MFCC) and linear prediction cepstral coefficient (LPCC) features are extracted, whereas for 2D PCG, features from various deep convolutional neural networks (D-CNNs) are used. Additionally, nature/bio-inspired algorithms (NIA/BIA), including particle swarm optimization (PSO) and the genetic algorithm (GA), are utilized for automatic and efficient feature selection directly from the raw PCG signal. To further improve classifier performance, a vision transformer (ViT) is implemented, leveraging the self-attention mechanism on the time-frequency representation (TFR) of the 2D PCG signal. Our extensive study presents a comparative performance analysis and the scope of enhancement for combinations of different descriptors, classifiers, and feature selection algorithms.

Main Results: Among all classifiers, ViT provides the best performance, achieving a mean average accuracy (Acc) of 99.90% and an F1-score of 99.95%, outperforming current state-of-the-art VHD classification models.

Conclusions: The present research provides a robust and efficient DL-based end-to-end PCG signal classification framework for designing an automated high-performance VHD diagnosis system.
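As a concrete illustration of the 1D and 2D branches described above, the sketch below extracts MFCC features from a raw PCG recording and builds a log-mel time-frequency representation (TFR) that is fed to a pretrained ViT classifier. The librosa/timm toolchain, the 2 kHz sampling rate, the five-class output (normal plus four VHD types), and the file name pcg_recording.wav are assumptions made for this example; the paper does not disclose its exact implementation.

```python
# Illustrative sketch only: 1D MFCC features and a 2D log-mel TFR routed to a ViT.
import numpy as np
import librosa
import torch
import timm

def pcg_mfcc(signal, sr=2000, n_mfcc=13):
    """Frame-level MFCCs from a raw 1D PCG signal, mean-pooled over time."""
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)                       # n_mfcc summary vector

def pcg_tfr(signal, sr=2000, size=224):
    """Log-mel spectrogram scaled to [0, 1] and resized to a 3-channel 224x224 image."""
    mel = librosa.feature.melspectrogram(y=signal, sr=sr, n_mels=128)
    tfr = librosa.power_to_db(mel, ref=np.max)
    tfr = (tfr - tfr.min()) / (tfr.max() - tfr.min() + 1e-8)
    img = torch.tensor(tfr, dtype=torch.float32)[None, None]  # (1, 1, n_mels, frames)
    img = torch.nn.functional.interpolate(img, size=(size, size),
                                          mode="bilinear", align_corners=False)
    return img.repeat(1, 3, 1, 1)                  # (1, 3, 224, 224)

# Assumed 5 output classes: normal plus four VHD types (e.g. AS, MS, MR, MVP).
# The classification head is freshly initialized and would need fine-tuning on PCG TFRs.
vit = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=5)
vit.eval()

signal, sr = librosa.load("pcg_recording.wav", sr=2000)   # hypothetical recording
mfcc_vec = pcg_mfcc(signal, sr)                  # 1D-branch descriptor for a classical ML classifier
with torch.no_grad():
    logits = vit(pcg_tfr(signal, sr))            # 2D-branch ViT prediction
predicted_class = int(logits.argmax(dim=1))
```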
