Article

Speech Discrimination in Real-World Group Communication Using Audio-Motion Multimodal Sensing

SENSORS (2020)

Journal

SENSORS

Volume 20, Issue 10, Pages -

Publisher

MDPI

DOI: 10.3390/s20102948

Keywords

speech discrimination; group communication; physical motion; multimodal sensing; sensor fusion; smartphone

Categories

Chemistry, Analytical; Engineering, Electrical & Electronic; Instruments & Instrumentation

Funding

KAKENHI from JSPS/MEXT, Japan [JP15H01771, JP17H01753]
JST-COI Grant from Japan Science and Technology Agency [JPMJCE1309]


Abstract

Speech discrimination, which determines whether a participant is speaking at a given moment, is essential for investigating human verbal communication. In dynamic real-world situations, where multiple people form and join groups in the same space, simultaneous speakers make speech discrimination based solely on audio sensing difficult. In this study, we focused on physical activity during speech and hypothesized that combining audio and physical motion data acquired by wearable sensors can improve speech discrimination. To test this, utterance and physical activity data of students in a university participatory class were recorded using smartphones worn around their necks. First, we tested the temporal relationship between manually identified utterances and physical motions and confirmed that physical activities in wide frequency ranges co-occurred with utterances. Second, we trained and tested classifiers for each participant and found higher performance with the audio-motion classifier (average accuracy 92.2%) than with either the audio-only (80.4%) or motion-only (87.8%) classifier. Finally, we tested inter-individual classification and obtained higher performance with the audio-motion combined classifier (83.2%) than with the audio-only (67.7%) and motion-only (71.9%) classifiers. These results show that audio-motion multimodal sensing using widely available smartphones can provide effective utterance discrimination in dynamic group communications.
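
The idea of combining audio and motion features can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say which features or which classifier the authors used, so a nearest-centroid classifier stands in, the helper names (fuse, fit_centroids, predict, accuracy) are hypothetical, and the six frames are invented toy values, not data from the study. The last frame imitates a nearby speaker: the microphone picks up loud audio, but the wearer is still and silent.

```python
from statistics import mean

def fuse(audio_feats, motion_feats):
    """Feature-level fusion: concatenate per-frame audio and motion features."""
    if len(audio_feats) != len(motion_feats):
        raise ValueError("audio and motion must have the same number of frames")
    return [list(a) + list(m) for a, m in zip(audio_feats, motion_feats)]

def fit_centroids(X, y):
    """Return the mean feature vector of each class label."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids

def predict(centroids, X):
    """Assign each frame to the class with the nearest centroid."""
    def dist2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return [min(centroids, key=lambda c: dist2(x, centroids[c])) for x in X]

def accuracy(y_true, y_pred):
    """Fraction of frames whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Invented toy frames (NOT study data): one audio-energy value and one
# motion-energy value per frame; label 1 = wearer speaking, 0 = not speaking.
audio = [[0.9], [0.8], [0.85], [0.1], [0.2], [0.8]]
motion = [[0.7], [0.6], [0.8], [0.1], [0.2], [0.1]]
labels = [1, 1, 1, 0, 0, 0]

# For brevity the sketch scores on the same frames it was fitted on;
# a real evaluation would hold out separate test data.
audio_acc = accuracy(labels, predict(fit_centroids(audio, labels), audio))
fused = fuse(audio, motion)
fused_acc = accuracy(labels, predict(fit_centroids(fused, labels), fused))

print(f"audio-only accuracy: {audio_acc:.3f}")
print(f"audio-motion accuracy: {fused_acc:.3f}")
```

On these toy frames the audio-only classifier mislabels the nearby-speaker frame, while the fused features separate it. The numbers are not comparable to the accuracies reported in the abstract.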
