Article

Parkinson's disease and cleft lip and palate of pathological speech diagnosis using deep convolutional neural networks evolved by IPWOA

Journal

APPLIED ACOUSTICS
Volume 199

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.apacoust.2022.109003

Keywords

Deep CNNs; Pathological speech; Parkinson's disease; Whale optimization algorithm; Cleft lip and palate

Funding

  1. Beijing Municipal Natural Science Foundation [4202028]
  2. National Natural Science Foundation of China [62036001]
  3. National Social Science Foundation of China [21BYY106, KYDE40201702]

This research utilizes deep convolutional neural networks for pathological voice recognition and employs the whale optimization algorithm to select the best network architecture. The proposed model demonstrates superior performance in classifying pathological speech signals.
Earlier automated systems built high-level abstract representations of speech with unsupervised models. Because these representations are typically learned through input reconstruction, there is no assurance that they are robust to cues unrelated to disease, so pathology diagnosis based on unsupervised representations is usually unreliable. This research therefore performs pathological voice recognition with deep convolutional neural networks (DCNNs). Although DCNNs have many acknowledged benefits, selecting the best structure for them can be challenging. To address this constraint, this work examines the use of the whale optimization algorithm (WOA) to choose the DCNN architecture automatically, and suggests three innovations based on the canonical WOA. First, a dedicated encoding technique based on Internet Protocol addresses (IPA) makes it easier to encode DCNN layers as whale vectors. Second, an enfeebled layer that occupies particular whale vector dimensions enables variable-length DCNNs. Third, the learning process splits large datasets into smaller ones and then reviews them at random. The proposed model is assessed on pathological audio signals captured from patients. The evaluation covers ROC and precision-recall curves as well as F1-score, sensitivity, specificity, accuracy, and precision. The proposed model correctly classifies up to 95.77 percent of the two disordered speech signals and outperforms the second-best algorithm, VLNSGA-II, by 1.02 percent in accuracy. © 2022 Elsevier Ltd. All rights reserved.
