Journal
IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT
Volume 72
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TIM.2023.3244849
Keywords
Electrodes; Speech recognition; Arrays; Muscles; Transfer learning; Neck; Vocabulary; Convolutional neural network (CNN); high-density surface EMG (sEMG); myoelectric control; silent speech recognition (SSR); transfer learning
The study develops a nonacoustic silent speech recognition (SSR) method that transfers knowledge learned from a high-density electrode array to a system using only a few channels, aiming for both high portability and high performance. A convolutional neural network (CNN) was trained on data recorded from face and neck muscles and then calibrated through transfer learning to adapt to a new target domain in which data are recorded by separate electrodes. The proposed method outperformed other classification approaches and still showed performance improvements under electrode shift and cross-user variability.
Although surface electromyogram (sEMG) signals recorded from high-density electrode arrays are believed to carry sufficient spatial information to benefit the decoding of motor intentions, the complexity of using such arrays has hindered their widespread application, especially in wearable devices. This study aims to develop a nonacoustic modality of silent speech recognition (SSR) that transfers knowledge learned from high-density arrays to a system using only a few channels, with both high portability and high performance. A convolutional neural network (CNN) was established to recognize a vocabulary of 33 Chinese words during subvocal speech production. In the source domain, the network was trained on data recorded from face and neck muscles using two arrays with 64 channels. It was then calibrated through a transfer learning approach to adapt it to a new target domain, in which data were recorded by eight separate electrodes, while its ability to characterize subvocal speech word patterns is expected to be preserved. The proposed method significantly outperformed three common classification approaches and a baseline without transfer learning (a network trained only on target-domain data). It still yielded performance improvements under electrode shift and cross-user variability. The method is thus shown to be viable for transfer learning across electrode settings, and it helps improve the performance of SSR systems that use separate electrode sites under the guidance of high-density arrays.
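The source-to-target calibration described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which layers are adapted, so the sketch assumes one common scheme in which the classification head learned in the source domain is frozen and only a new input layer is trained for the eight-channel setting. A tiny dense network stands in for the CNN, the data are random placeholders, and the names `toy_data`, `sgd_step`, `pretrain_then_calibrate`, the hidden size, and the treatment of the source input as 64 channels in total are all hypothetical choices made for illustration.

```python
import math
import random

N_CLASSES = 33            # vocabulary size stated in the abstract
SRC_CH, TGT_CH = 64, 8    # assumed input sizes for source and target domains
HIDDEN = 16               # assumed hidden width, not from the paper


def init(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.gauss(0, 0.1) for _ in range(cols)] for _ in range(rows)]


def toy_data(n, channels, rng):
    """Random placeholder features and labels; NOT real sEMG data."""
    return [([rng.gauss(0, 1) for _ in range(channels)],
             rng.randrange(N_CLASSES)) for _ in range(n)]


def forward(w_in, w_out, x):
    """Input layer (tanh) followed by a softmax classification head."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_in]
    z = [sum(w * hi for w, hi in zip(row, h)) for row in w_out]
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return h, [v / s for v in e]


def sgd_step(w_in, w_out, x, y, lr, train_head):
    """One cross-entropy SGD step; the head is updated only if train_head."""
    h, p = forward(w_in, w_out, x)
    # Gradient of softmax + cross-entropy with respect to the logits.
    dz = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]
    # Backpropagate through the head and the tanh before touching w_out.
    dh = [sum(dz[k] * w_out[k][j] for k in range(len(w_out))) * (1 - h[j] ** 2)
          for j in range(len(h))]
    if train_head:
        for k, row in enumerate(w_out):
            for j in range(len(row)):
                row[j] -= lr * dz[k] * h[j]
    for j, row in enumerate(w_in):
        for i in range(len(row)):
            row[i] -= lr * dh[j] * x[i]


def pretrain_then_calibrate(seed=0):
    rng = random.Random(seed)
    source = toy_data(200, SRC_CH, rng)   # stands in for array recordings
    target = toy_data(40, TGT_CH, rng)    # stands in for 8-electrode data

    # 1) Source domain: train the input layer and the head together.
    w_in_src = init(HIDDEN, SRC_CH, rng)
    w_out = init(N_CLASSES, HIDDEN, rng)
    for x, y in source:
        sgd_step(w_in_src, w_out, x, y, lr=0.05, train_head=True)

    # 2) Target domain: new 8-channel input layer, head kept frozen.
    head_before = [row[:] for row in w_out]
    w_in_tgt = init(HIDDEN, TGT_CH, rng)
    for x, y in target:
        sgd_step(w_in_tgt, w_out, x, y, lr=0.05, train_head=False)
    return w_in_tgt, w_out, head_before
```

Calling `pretrain_then_calibrate()` returns the calibrated eight-channel input layer together with the head, which is identical before and after calibration. Because the data are random, the sketch shows only the mechanics of reusing source-domain weights and says nothing about the accuracy reported in the paper.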