☆ 4.6 Article

On the design of automatic voice condition analysis systems. Part II: Review of speaker recognition techniques and study on the effects of different variability factors

BIOMEDICAL SIGNAL PROCESSING AND CONTROL (2019)

Journal

BIOMEDICAL SIGNAL PROCESSING AND CONTROL

Volume 48, Issue -, Pages 128-143

Publisher

ELSEVIER SCI LTD

DOI: 10.1016/j.bspc.2018.09.003

Keywords

Robust automatic voice condition analysis; Universal background models; Extralinguistic aspects of the speech; Cross-dataset validation

Categories

Engineering, Biomedical

Funding

Ministry of Economy and Competitiveness of Spain [DPI2017-83405-R1]
Becas de Ayuda a la Movilidad of the Universidad Politecnica de Madrid

Abstract

This is the second of a two-part series devoted to the automatic voice condition analysis of voice pathologies, and a direct continuation of the paper "On the design of automatic voice condition analysis systems. Part I: Review of concepts and an insight to the state of the art". The aim of this study is to examine several variability factors affecting the robustness of systems that automatically detect the presence of voice pathologies from audio recordings. Multiple experiments are performed to test the influence of the speech task, extralinguistic aspects (such as sex), the acoustic features, and the classifiers on system performance. Some experiments are carried out using state-of-the-art classification methodologies often employed in speaker recognition. To evaluate the robustness of the methods, testing is repeated across several corpora, with the aim of creating a single system that integrates the conclusions obtained previously. This system is later tested under cross-dataset scenarios in an attempt to obtain more realistic conclusions. Results identify a reduced subset of relevant features, which are used in a hierarchical-like scenario incorporating information from different speech tasks. In particular, for the experiments carried out using the Saarbruecken voice dataset, the area under the ROC curve of the system reached 0.88 in an intra-dataset setting and ranged from 0.82 to 0.94 in cross-dataset scenarios. These results open a discussion about the suitability of these techniques for transfer to the clinical setting. (C) 2018 Elsevier Ltd. All rights reserved.
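The abstract reports performance as the area under the ROC curve (AUC). As a reading aid, the sketch below shows one standard way to compute that metric: the fraction of (pathological, normophonic) recording pairs in which the pathological recording receives the higher score, with ties counted as half. It is not the authors' code; the function name `roc_auc`, the label convention (1 = pathological, 0 = normophonic), and the toy scores are assumptions made purely for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Assumed convention (hypothetical, not from the paper):
    label 1 = pathological voice, label 0 = normophonic voice,
    and a higher score means "more likely pathological".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy, made-up scores: three of the four positive/negative pairs
# are ordered correctly.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

In an intra-dataset evaluation the labels and scores would come from held-out recordings of the same corpus used for training; in a cross-dataset evaluation they would come from a different corpus, which is why the two settings can yield different AUC values.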
