Article

Convergence of Artificial Intelligence and Internet of Things in Smart Healthcare: A Case Study of Voice Pathology Detection

Journal

IEEE ACCESS
Volume 9, Pages 89198-89209

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2021.3090317

Keywords

Pathology; Medical services; Artificial intelligence; Speech recognition; Paralysis; Internet of Things; Larynx; Smart healthcare; deep learning; convolutional neural network (CNN); long short-term memory (LSTM); voice pathology detection

Funding

  1. Deanship of Scientific Research at King Saud University, Riyadh, Saudi Arabia [RG-1436-016]

The integration of artificial intelligence (AI) and the Internet of Things (IoT) has great potential in smart healthcare, especially in the wake of COVID-19. A voice pathology detection system within a smart healthcare framework is proposed. It captures voice and electroglottography signals through microphones and electroglottography devices and classifies them with deep neural networks, and the bimodal input yields higher accuracy than either signal alone.
The integration of artificial intelligence (AI) and the Internet of Things (IoT) has tremendous prospects in smart healthcare. Advances in AI, in the form of deep learning, have revolutionized automatic classification and detection systems. In addition, next-generation wireless communications such as 5G networking have brought high-speed, seamless data transmission. With the convergence of these elements, the smart healthcare sector is currently booming. In the wake of the COVID-19 pandemic in particular, the need for smart healthcare has become more apparent than before. A significant number of people suffer from voice pathology, which can be easily cured if detected early. In this study, a voice pathology detection system within a smart healthcare framework is proposed. The inputs are obtained through IoT devices, namely microphones and electroglottography (EGG) devices, which capture voice and EGG signals, respectively. Spectrograms are obtained from these signals and fed into a pretrained convolutional neural network (CNN). The features extracted by the CNN are fused and processed using a bi-directional long short-term memory (LSTM) network. The proposed system is evaluated on a publicly available database, the Saarbruecken voice database. The experimental results show that bimodal input performs better than a single input. The proposed system achieves an accuracy of 95.65%.
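
The front end of the pipeline described in the abstract (signal to spectrogram, then fusion of the two modalities) can be illustrated with a minimal sketch. This is not the authors' implementation. The frame length, hop size, sampling rate, and the function names `spectrogram` and `fuse` are all assumptions made for illustration, the input signals are synthetic sine waves, and raw spectrogram frames stand in for the features that the paper extracts with a pretrained CNN. The fusion shown here is simple frame-wise concatenation, which is one common choice and may differ from the paper's.

```python
import numpy as np


def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram from a Hann-windowed short-time Fourier transform."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack([signal[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(magnitude)  # shape: (n_frames, frame_len // 2 + 1)


def fuse(voice_feats, egg_feats):
    """Feature-level fusion: concatenate the two modalities frame by frame."""
    n = min(len(voice_feats), len(egg_feats))
    return np.concatenate([voice_feats[:n], egg_feats[:n]], axis=1)


# Synthetic stand-ins for one second of voice and EGG signal (hypothetical data).
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
voice = np.sin(2 * np.pi * 220 * t)
egg = np.sin(2 * np.pi * 110 * t)

# In the paper, CNN features (not raw spectrogram frames) are fused at this point.
fused = fuse(spectrogram(voice), spectrogram(egg))
print(fused.shape)
```

The fused array has one row per time frame, which is the kind of sequence a bi-directional LSTM could then consume to produce a healthy/pathological decision.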
