Article

Silent Speech Recognition Based on Surface Electromyography Using a Few Electrode Sites Under the Guidance From High-Density Electrode Arrays

Journal

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TIM.2023.3244849

Keywords

Electrodes; Speech recognition; Arrays; Muscles; Transfer learning; Neck; Vocabulary; Convolutional neural network (CNN); high-density surface EMG (sEMG); myoelectric control; silent speech recognition (SSR)

The study aims to develop a nonacoustic silent speech recognition (SSR) modality that transfers knowledge learned from a high-density electrode array to a system using only a few channels, combining high portability with high performance. A convolutional neural network (CNN) was trained on data recorded from face and neck muscles and then calibrated through transfer learning to adapt to a new target domain with data recorded by separate electrodes. The proposed method outperformed other classification approaches and still showed performance improvements under electrode shift and cross-user variability.
Although the surface electromyogram (sEMG) recorded from a high-density electrode array is believed to carry sufficient spatial information to benefit the decoding of motor intentions, the complexity of using such an array has hindered its widespread application, especially in wearable devices. This study aims to develop a nonacoustic modality of silent speech recognition (SSR) that transfers knowledge learned from a high-density array to a system using only a few channels, with both high portability and high performance. A convolutional neural network (CNN) was established to recognize a vocabulary of 33 Chinese words during subvocal speech production. In the source domain, the network was trained on data recorded from face and neck muscles using two arrays with 64 channels. It was then calibrated through a transfer learning approach to adapt it to a new target domain, with data recorded by eight separate electrodes, while its capability of characterizing subvocal speech word patterns was expected to be maintained. The proposed method significantly outperformed three common classification approaches and the baseline approach without transfer learning (a network trained only on data from the target domain). It still obtained performance improvements under conditions of electrode shift and cross-user variability. The method is demonstrated to be viable for transfer learning across domains of electrode settings, and it helps improve the performance of SSR systems that use separate electrode sites under the guidance of high-density arrays.
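The calibration step described above can be illustrated with a minimal sketch. This is not the authors' CNN: it is a tiny dense network in pure Python, and the choice to freeze a pretrained classification head while retraining only an input adapter is an assumption, since the abstract does not state which layers were updated. The names `adapter`, `head`, `calibrate`, and `N_HIDDEN`, along with all weights and data, are hypothetical and synthetic; only the 33-word vocabulary and the eight target electrodes come from the abstract.

```python
import math
import random

N_CLASSES = 33    # vocabulary size stated in the abstract
N_TARGET_CH = 8   # separate electrodes in the target domain (from the abstract)
N_HIDDEN = 16     # assumption: size of the shared feature space


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def forward(adapter, head, x):
    """Adapter maps target channels to hidden features; head classifies them."""
    h = [math.tanh(v) for v in matvec(adapter, x)]
    return h, softmax(matvec(head, h))


def calibrate(adapter, head, data, lr=0.05, epochs=20):
    """Fine-tune only the adapter on target-domain data; the head stays frozen."""
    for _ in range(epochs):
        for x, y in data:
            h, p = forward(adapter, head, x)
            # Gradient of cross-entropy with respect to the logits.
            d_logits = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]
            # Backpropagate through the frozen head (read only, never updated).
            d_h = [sum(head[k][j] * d_logits[k] for k in range(len(head)))
                   for j in range(len(h))]
            # Backpropagate through tanh.
            d_pre = [dh * (1.0 - hj * hj) for dh, hj in zip(d_h, h)]
            # Update the adapter weights in place.
            for j, row in enumerate(adapter):
                for i in range(len(row)):
                    row[i] -= lr * d_pre[j] * x[i]
    return adapter


rng = random.Random(0)

# Stand-in for weights learned on high-density source data (hypothetical values).
head = [[rng.gauss(0, 0.5) for _ in range(N_HIDDEN)] for _ in range(N_CLASSES)]
head_before = [row[:] for row in head]

# New input layer for the 8-channel target domain, trained during calibration.
adapter = [[rng.gauss(0, 0.1) for _ in range(N_TARGET_CH)] for _ in range(N_HIDDEN)]
adapter_before = [row[:] for row in adapter]

# Synthetic calibration set: (8-channel feature vector, word index).
target_data = [([rng.gauss(0, 1) for _ in range(N_TARGET_CH)],
                rng.randrange(N_CLASSES)) for _ in range(66)]

calibrate(adapter, head, target_data)
_, probs = forward(adapter, head, target_data[0][0])
```

In this sketch, the layers kept from the source domain carry the learned word patterns, and only a small number of parameters are adapted to the few-channel recording, which keeps the amount of target-domain calibration data low. A real system would operate on windowed sEMG features or raw signal segments rather than random vectors.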
