Journal
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING
Volume 27, Issue 9, Pages 1469-1480
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TASLP.2019.2913499
Keywords
Speech question answering; TOEFL; SQuAD; attention model; deep learning
Funding
- Ministry of Science and Technology of Taiwan
A user can easily scan through text, but this is not the case for spoken content, which cannot be displayed directly on-screen. As a result, accessing large collections of spoken content is much more difficult and time-consuming than accessing text. It would therefore be helpful to develop machines that understand spoken content. In this paper, we propose two new tasks for machine comprehension of spoken content. The first is a listening comprehension test for TOEFL, a challenging academic English examination for learners who are not native English speakers. We show that the proposed model outperforms naive approaches and other neural-network-based models by exploiting the hierarchical structure of natural language and the selective power of the attention mechanism. For the second listening comprehension task, spoken SQuAD, we find that speech recognition errors severely impair machine comprehension; we propose the use of subword units to mitigate the impact of these errors.