Article

Self-adaptive logit balancing for deep neural network robustness: Defence and detection of adversarial attacks

Journal

Neurocomputing
Volume 531, Pages 180-194

Publisher

Elsevier
DOI: 10.1016/j.neucom.2023.02.013

Keywords

Machine learning security; Adversarial examples; Adversarial robustness; Adversarial attacks detection; Deep neural networks


Summary

This paper proposes a novel defence method that improves the adversarial robustness of DNN classifiers without adversarial training. The method introduces two new loss functions, one to punish overconfidence and one to protect the network from non-targeted attacks. It also presents a new robustness diagram for analysing and visualising a network's robustness against adversarial attacks, and a Log-Softmax-pattern-based method for detecting adversarial attacks.
Abstract

With the widespread application of Deep Neural Networks (DNNs), the safety of DNNs has become a significant issue. The vulnerability of neural networks to adversarial examples deepens concerns about the safety of DNN applications. This paper proposes a novel defence method to improve the adversarial robustness of DNN classifiers without using adversarial training. The method introduces two new loss functions. First, a zero-cross-entropy loss punishes overconfidence and finds the appropriate confidence for different instances. Second, a logit balancing loss protects DNNs from non-targeted attacks by regularising the logit distribution of the incorrect classes. The method achieves adversarial robustness competitive with advanced adversarial training methods. In addition, a novel robustness diagram is proposed to analyse, interpret and visualise the robustness of DNN classifiers against adversarial attacks. Furthermore, a Log-Softmax-pattern-based adversarial attack detection method is proposed. This detection method can distinguish clean inputs from multiple adversarial attacks with a single multi-class MLP. In particular, it is state-of-the-art at identifying white-box gradient-based attacks: it achieves at least 95.5% accuracy in classifying four white-box gradient-based attacks with at most a 0.1% false positive ratio. (c) 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
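The abstract names the two losses but does not give their formulas. The PyTorch sketch below is one plausible reading, not the paper's actual definitions: overconfidence_penalty stands in for the zero-cross-entropy loss by capping the true-class probability, and logit_balancing_penalty pushes the incorrect classes' logits toward a uniform distribution by penalising their variance. The function names, conf_cap, and the alpha/beta weighting are all assumptions introduced here for illustration.

```python
import torch
import torch.nn.functional as F

def overconfidence_penalty(logits, targets, conf_cap=0.9):
    # Stand-in for the paper's zero-cross-entropy loss (assumed form):
    # penalise true-class probability mass above conf_cap.
    probs = F.softmax(logits, dim=1)
    p_true = probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    return F.relu(p_true - conf_cap).mean()

def logit_balancing_penalty(logits, targets):
    # Stand-in for the paper's logit balancing loss (assumed form):
    # penalise the variance of the incorrect classes' logits so that
    # they approach a uniform distribution.
    num_classes = logits.size(1)
    true_mask = F.one_hot(targets, num_classes).bool()
    wrong = logits.masked_fill(true_mask, 0.0)  # drop the true-class logit
    mean_wrong = wrong.sum(dim=1, keepdim=True) / (num_classes - 1)
    sq_dev = ((wrong - mean_wrong) ** 2).masked_fill(true_mask, 0.0)
    return (sq_dev.sum(dim=1) / (num_classes - 1)).mean()

def training_loss(logits, targets, alpha=1.0, beta=1.0):
    # Combined objective: standard cross-entropy plus the two penalties.
    ce = F.cross_entropy(logits, targets)
    return (ce + alpha * overconfidence_penalty(logits, targets)
               + beta * logit_balancing_penalty(logits, targets))

# Smoke test on random data.
logits = torch.randn(8, 10, requires_grad=True)
targets = torch.randint(0, 10, (8,))
training_loss(logits, targets).backward()
```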
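The detection method feeds the Log-Softmax output of the protected classifier into a single multi-class MLP. Below is a minimal sketch of such a detector, assuming the MLP reads the sorted Log-Softmax vector and outputs one of k attack classes plus a "clean" class; the layer sizes, the sorting step, and the class layout are assumptions, since the abstract does not specify the architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LogSoftmaxPatternDetector(nn.Module):
    # Sketch of a Log-Softmax-pattern detector: one MLP mapping the
    # protected model's Log-Softmax vector to {clean, attack_1..k}.
    # Hidden width and depth are illustrative choices, not the paper's.
    def __init__(self, num_classes: int, num_attack_types: int, hidden: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(num_classes, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_attack_types + 1),  # index 0 = clean (a convention assumed here)
        )

    def forward(self, logits: torch.Tensor) -> torch.Tensor:
        # Sorting lets the detector see the shape of the Log-Softmax
        # pattern rather than class identities (an assumption here).
        pattern, _ = torch.sort(F.log_softmax(logits, dim=1),
                                dim=1, descending=True)
        return self.mlp(pattern)

# Usage: classify an input as clean or as one of four attack types
# (e.g. FGSM/PGD-style white-box gradient attacks).
detector = LogSoftmaxPatternDetector(num_classes=10, num_attack_types=4)
logits = torch.randn(8, 10)           # stand-in for the classifier's logits
verdict = detector(logits).argmax(1)  # 0 = clean, 1..4 = attack type
```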
