Article

End-to-end deep learning approach for Parkinson's disease detection from speech signals

BIOCYBERNETICS AND BIOMEDICAL ENGINEERING (2022)

Journal

BIOCYBERNETICS AND BIOMEDICAL ENGINEERING

Volume 42, Issue 2, Pages 556-574

Publisher

ELSEVIER

DOI: 10.1016/j.bbe.2022.04.002

Keywords

Parkinson's disease; Deep learning; End-to-end; Speech disorder; Feature visualization

Category

Engineering, Biomedical

Summary

This paper proposes a novel deep learning model for Parkinson's disease detection from speech signals. The model extracts time series dynamic features using 2D-CNNs and captures the dependencies between them using 1D-CNN. The proposed model outperformed other machine learning models and achieved high accuracies on speech tasks in different languages. The features generated by the model were able to capture the characteristics of Parkinson's disease sounds, such as reduced overall frequency range and variability. The low-frequency region of the Mel-spectrogram was found to be more influential and important for Parkinson's disease detection from speech.

Abstract

More than 90% of patients with Parkinson's disease suffer from hypokinetic dysarthria. This paper proposes a novel end-to-end deep learning model for Parkinson's disease detection from speech signals. The proposed model extracts time series dynamic features using time-distributed two-dimensional convolutional neural networks (2D-CNNs), and then captures the dependencies between these time series using a one-dimensional CNN (1D-CNN). The performance of the proposed model was verified on two databases. On Database-1, the proposed model outperformed machine learning models based on expert features and achieved promising results, with accuracies of 81.6% on the sustained vowel /a/ task and 75.3% on the task of reading a short sentence (/si shi si zhi shi shi zi/) in Chinese. On Database-2, the proposed model was assessed on multiple sound types, including vowels, words, and sentences. An accuracy of up to 92% was obtained on the speech tasks, which included reading simple (/loslibros/) and complex (/viste/) sentences in Spanish. Visualizing the features generated by the model showed that the learned time series dynamic features capture the reduced overall frequency range and reduced variability of Parkinson's disease sounds, which are important clinical evidence for detecting Parkinson's disease patients. The results also suggest that the low-frequency region of the Mel-spectrogram is more influential and important than the high-frequency region for Parkinson's disease detection from speech. © 2022 Nalecz Institute of Biocybernetics and Biomedical Engineering of the Polish Academy of Sciences. Published by Elsevier B.V. All rights reserved.
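The data flow described in the abstract (the same 2D convolution applied to each time segment of a spectrogram, followed by a 1D convolution across the resulting sequence) can be illustrated with a toy sketch. This is not the authors' architecture: the segment width, kernel values, ReLU plus global-average pooling step, and all function names below are assumptions chosen for illustration, and the weights are fixed rather than learned. A real model would be trained in a deep learning framework.

```python
# Toy sketch (assumed sizes and weights, not the paper's model):
# shared 2D convolution per time segment, then 1D convolution across segments.

def conv2d_valid(image, kernel):
    """Valid-mode 2D cross-correlation of a list-of-lists image with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def relu_global_mean(feature_map):
    """ReLU followed by global average pooling: one scalar per feature map."""
    values = [max(0.0, v) for row in feature_map for v in row]
    return sum(values) / len(values)

def split_into_segments(spectrogram, segment_width):
    """Cut an (n_mels x n_frames) spectrogram into non-overlapping time segments."""
    n_frames = len(spectrogram[0])
    return [
        [row[start:start + segment_width] for row in spectrogram]
        for start in range(0, n_frames - segment_width + 1, segment_width)
    ]

def time_distributed_features(segments, kernels):
    """Apply the SAME 2D kernels to every segment: one feature vector per segment."""
    return [
        [relu_global_mean(conv2d_valid(seg, k)) for k in kernels]
        for seg in segments
    ]

def conv1d_over_time(sequence, kernel):
    """Valid-mode 1D cross-correlation along the segment axis.

    kernel[dt][c] weights channel c at time offset dt; channels are summed.
    """
    width, channels = len(kernel), len(kernel[0])
    return [
        sum(sequence[t + dt][c] * kernel[dt][c]
            for dt in range(width) for c in range(channels))
        for t in range(len(sequence) - width + 1)
    ]

# Toy input: 4 mel bands x 12 frames (arbitrary values, not real speech).
toy_spectrogram = [[float((m + 1) * (t % 3)) for t in range(12)] for m in range(4)]
segments = split_into_segments(toy_spectrogram, segment_width=4)
kernels = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, -1.0], [1.0, -1.0]]]
features = time_distributed_features(segments, kernels)  # 3 segments x 2 channels
temporal_kernel = [[0.5, 0.5], [0.5, 0.5]]               # width 2 over 2 channels
scores = conv1d_over_time(features, temporal_kernel)
print(len(segments), len(features[0]), len(scores))      # prints: 3 2 2
```

Sharing the 2D kernels across segments is what "time-distributed" refers to here: every segment is mapped into the same feature space, so the 1D convolution can compare neighbouring segments directly.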
