Journal
IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS
Volume 14, Issue 4, Pages 1666-1677
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TCDS.2021.3137251
Keywords
Brain-computer interface (BCI); electroencephalography (EEG); feature extraction; lexical tone; machine learning; Mandarin; speech recognition
Funding
- Ministry of Science and Technology, Taiwan [MOST 110-2221-E-A49-130-MY2, MOST 109-2221-E-009-050-MY2, MOST 110-2634-F-009-021-MY2, MOST 110-2221-E-A49-039-MY3]
- Center for Open Intelligent Connectivity from The Featured Areas Research Center Program
This study asks whether EEG signals can distinguish spoken Mandarin sentences with tone from those without tone. Using the BRCSpeech database and the rational asymmetry (RASM) feature extraction method, it achieved high accuracy in cross-subject classification, which could help develop a BCI-based tonal language synthesis system in the future.
Most current research has focused on nontonal languages such as English. However, more than 60% of the world's population speaks tonal languages, and Mandarin is the most widely spoken tonal language in the world. In tonal languages, tone can change the meaning of a word and convey feeling, which is very different from nontonal languages. The objective of this study is to determine whether a spoken Mandarin sentence with or without tone can be distinguished by analyzing electroencephalographic (EEG) signals. We first constructed a new Brain Research Center Speech (BRCSpeech) database to recognize Mandarin. The EEG data of 14 participants were recorded while they articulated preselected sentences. To the best of our knowledge, this is the first study to apply an asymmetric feature extraction method to speech recognition using EEG signals. This study shows that the rational asymmetry (RASM) feature extraction method achieves the best accuracy in cross-subject classification. In addition, our proposed binomial variable algorithm methodology achieves 98.82% accuracy in cross-subject classification. Furthermore, we demonstrate that using eight channels [(F7, F8), (C5, C6), (P5, P6), and (O1, O2)] achieves an accuracy of 94.44%. This study explores the neurophysiological correlates of Mandarin pronunciation, which can help develop a tonal language synthesis system based on BCI in the future.
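To make the RASM idea concrete, the sketch below computes one feature per left/right electrode pair as the ratio of a per-channel quantity, using the four pairs named in the abstract. This is an illustration under stated assumptions, not the authors' pipeline: the abstract does not say which per-channel quantity is used, so mean signal power is assumed here, and the names `channel_power`, `rasm_features`, and `PAIRS` are hypothetical.

```python
import numpy as np

# Left/right electrode pairs named in the abstract.
PAIRS = [("F7", "F8"), ("C5", "C6"), ("P5", "P6"), ("O1", "O2")]


def channel_power(eeg):
    """Mean squared amplitude per channel.

    eeg has shape (n_channels, n_samples). Mean power is an assumed
    stand-in; the abstract does not specify the underlying feature.
    """
    return np.mean(np.square(eeg), axis=1)


def rasm_features(eeg, channel_names, pairs=PAIRS, eps=1e-12):
    """Return one left/right ratio per electrode pair (hypothetical helper)."""
    power = channel_power(np.asarray(eeg, dtype=float))
    index = {name: i for i, name in enumerate(channel_names)}
    # eps guards against division by zero on a flat right-hemisphere channel.
    return np.array(
        [power[index[left]] / (power[index[right]] + eps) for left, right in pairs]
    )


if __name__ == "__main__":
    names = ["F7", "F8", "C5", "C6", "P5", "P6", "O1", "O2"]
    rng = np.random.default_rng(0)
    trial = rng.standard_normal((len(names), 1000))  # synthetic data only
    print(rasm_features(trial, names))
```

With eight channels this yields a four-element feature vector per trial, which could then be passed to any classifier; the choice of classifier is left open here because the abstract does not describe it.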