4.2 Review

Black-box adversarial attacks through speech distortion for speech emotion recognition

Journal

Publisher

SPRINGER
DOI: 10.1186/s13636-022-00254-7

Keywords

Convolutional Neural Network; Robustness; Speech emotion recognition; Adversarial attack; Adversarial training

Funding

  1. National Natural Science Foundation of China [61300055]
  2. Zhejiang Natural Science Foundation [LY20F020010]
  3. Ningbo Natural Science Foundation [202003N4089]
  4. K.C. Wong Magna Fund in Ningbo University

This article introduces speech emotion recognition as a key branch of affective computing and examines the performance of different emotion recognition methods. The authors probe the robustness of these models with black-box adversarial attacks and show that adversarial training is effective at defending against such attacks.
Speech emotion recognition is a key branch of affective computing, and it is now common to detect emotional disorders through speech emotion recognition. Various emotion recognition models, such as LSTM, GCN, and CNN, show excellent performance, but their limited robustness means that recognition results can deviate substantially under perturbed inputs. In this article, we therefore use black-box adversarial-example attacks to probe the robustness of these models. Under three different black-box attacks, the accuracy of the CNN-MAA model decreased by 69.38% in the best attack scenario, while the word error rate (WER) of the speech decreased by only 6.24%, indicating that the model's robustness does not hold up under our black-box attack method. After adversarial training, the model accuracy decreased by only 13.48%, which demonstrates the effectiveness of adversarial training against adversarial-example attacks. Our code is available on GitHub.
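
The abstract describes a query-based black-box setting: the attacker can only observe the model's outputs, not its gradients or parameters. As a rough, hypothetical illustration of that setting (not the paper's actual attack or its CNN-MAA model), the sketch below runs a simple random-search attack against a stand-in NumPy classifier; emotion_classifier, black_box_attack, and all parameter values are placeholder assumptions.

    # Hypothetical sketch of a query-based (black-box) adversarial attack.
    # The "model" is a toy stand-in, not the paper's CNN-MAA network.
    import numpy as np

    rng = np.random.default_rng(0)

    # Fixed random linear head standing in for a trained 4-class emotion model.
    W = np.linalg.qr(rng.standard_normal((4, 4)))[0]

    def emotion_classifier(waveform):
        """Stand-in SER model: maps a waveform to 4 emotion-class probabilities."""
        feats = np.array([waveform.mean(),
                          waveform.std(),
                          np.abs(np.fft.rfft(waveform))[:64].mean() / 100.0,
                          1.0])
        logits = feats @ W
        e = np.exp(logits - logits.max())
        return e / e.sum()

    def black_box_attack(x, true_label, eps=0.005, queries=1000):
        """Random-search attack: keep small perturbations that lower the
        true-class confidence, using only the model's outputs (no gradients)."""
        x_adv = x.copy()
        best = emotion_classifier(x_adv)[true_label]
        for _ in range(queries):
            candidate = np.clip(x_adv + rng.normal(0.0, eps, size=x.shape), -1.0, 1.0)
            score = emotion_classifier(candidate)[true_label]
            if score < best:  # accept only if the true-class confidence drops
                x_adv, best = candidate, score
        return x_adv

    clean = rng.uniform(-0.5, 0.5, size=16000)           # 1 s of synthetic 16 kHz audio
    label = int(np.argmax(emotion_classifier(clean)))    # model's original prediction
    adv = black_box_attack(clean, label)
    print("true-class confidence, clean      :", emotion_classifier(clean)[label])
    print("true-class confidence, adversarial:", emotion_classifier(adv)[label])

In the paper's setting, the perturbation would additionally be constrained so that the distorted speech remains intelligible, which is why the abstract reports the change in word error rate alongside the drop in emotion recognition accuracy.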

Reviews

Primary Rating: 4.2 (not enough ratings)

Secondary Ratings

  Novelty: -
  Significance: -
  Scientific rigor: -