4.7 Article

Dynamic time warping in phoneme modeling for fast pronunciation error detection

期刊

COMPUTERS IN BIOLOGY AND MEDICINE
卷 69, 期 -, 页码 277-285

出版社

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.compbiomed.2015.12.004

关键词

Pronunciation error detection; CAPT systems; DTW algorithm; Phoneme modeling; Word structure analysis

向作者/读者索取更多资源

The presented paper describes a novel approach to the detection of pronunciation errors. It makes use of the modeling of well-pronounced and mispronounced phonemes by means of the Dynamic Time Warping (DTW) algorithm. Four approaches that make use of the DTW phoneme modeling were developed to detect pronunciation errors: Variations of the Word Structure (VoWS), Normalized Phoneme Distances Thresholding (NPDT), Furthest Segment Search (FSS) and Normalized Furthest Segment Search (NFSS). The performance evaluation of each module was carried out using a speech database of correctly and incorrectly pronounced words in the Polish language, with up to 10 patterns of every trained word from a set of 12 words having different phonetic structures. The performance of DTW modeling was compared to Hidden Markov Models (HMM) that were used for the same four approaches (VoWS, NPDT, FSS, NFSS). The average error rate (AER) was the lowest for DTW with NPDT (AER=0.287) and scored better than HMM with FSS (AER=0.473), which was the best result for HMM. The DTW modeling was faster than HMM for all four approaches. This technique can be used for computer-assisted pronunciation training systems that can work with a relatively small training speech corpus (less than 20 patterns per word) to support speech therapy at home. (C) 2015 Elsevier Ltd. All rights reserved.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.7
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据