Adversarial Learning Targeting Deep Neural Network Classification: A Comprehensive Review of Defenses Against Attacks

Journal

Proceedings of the IEEE
Volume 108, Issue 3, Pages 402-433

Publisher

Institute of Electrical and Electronics Engineers (IEEE)
DOI: 10.1109/JPROC.2020.2970615

Keywords

Training data; Neural networks; Machine learning; Robustness; Feature extraction; Social networking (online); Informatics; Adversarial machine learning; Anomaly detection (AD); backdoor; black box; data poisoning (DP); deep neural networks (DNNs); membership inference attack; reverse engineering (RE); robust classification; targeted attacks; test-time-evasion (TTE); transferability; white box

Abstract

With the wide deployment of machine learning (ML)-based systems for a variety of applications, including medical, military, automotive, genomic, multimedia, and social networking, there is great potential for damage from adversarial learning (AL) attacks. In this article, we provide a contemporary survey of AL, focused particularly on defenses against attacks on deep neural network classifiers. After introducing relevant terminology and the goals and range of possible knowledge of both attackers and defenders, we survey recent work on test-time evasion (TTE), data poisoning (DP), backdoor DP, and reverse engineering (RE) attacks and, particularly, defenses against them. In so doing, we distinguish robust classification from anomaly detection (AD), unsupervised from supervised defenses, and statistical hypothesis-based defenses from ones that do not have an explicit null (no attack) hypothesis. We also consider several scenarios for detecting backdoors. We provide a technical assessment of the reviewed works, identifying any issues or limitations, required hyperparameters, computational complexity, and the performance measures evaluated and the quality obtained. We then delve deeper, providing novel insights that challenge conventional AL wisdom and target unresolved issues, including: robust classification versus AD as a defense strategy; the belief that attack success increases with attack strength, which ignores susceptibility to AD; whether small perturbations for TTE attacks are a fallacy or a requirement; the validity of the universal assumption that a TTE attacker knows the ground-truth class of the example to be attacked; black-, gray-, or white-box attacks as the standard for defense evaluation; and the susceptibility of query-based RE to an AD defense. We also discuss attacks on the privacy of training data. We then present benchmark comparisons of several defenses against TTE, RE, and backdoor DP attacks on images. The article concludes with a discussion of continuing research directions, including the supreme challenge of detecting attacks whose goal is not to alter classification decisions but simply to embed, without detection, fake news or other false content.
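For readers new to the area, two of the concepts named in the abstract can be grounded with small sketches. First, a minimal TTE attack in the style of the fast gradient sign method, assuming a PyTorch classifier that outputs logits for inputs in [0, 1]; the names fgsm_attack and epsilon are illustrative and not taken from the paper:

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.03):
    """One-step L-infinity TTE perturbation (fast gradient sign method)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Step in the direction that increases the classification loss while
    # keeping the perturbation small: ||x_adv - x||_inf <= epsilon.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

Second, a toy statistical AD defense with an explicit null (no attack) hypothesis: calibrate a detection statistic on held-out clean data, then flag test inputs whose empirical p-value under that null falls below a significance level. The statistic used here (maximum softmax score) is deliberately simple; the defenses surveyed in the paper use far richer statistics, and all names below are again illustrative:

```python
import torch

@torch.no_grad()
def calibrate_null(model, clean_loader):
    # Empirical null ("no attack") distribution of the detection statistic
    # (maximum softmax score), estimated from held-out clean examples.
    scores = [model(x).softmax(dim=1).max(dim=1).values
              for x, _ in clean_loader]
    return torch.cat(scores)

@torch.no_grad()
def flag_as_attack(model, x, null_scores, alpha=0.05):
    # One-sided empirical p-value: how unusually low is this input's
    # confidence relative to the clean-data null distribution?
    s = model(x.unsqueeze(0)).softmax(dim=1).max().item()
    p_value = (null_scores <= s).float().mean().item()
    return p_value < alpha  # reject the null => declare a possible attack
```

Together, these sketches illustrate one of the article's central tensions: increasing epsilon makes the attack succeed more often, but it also pushes the defender's detection statistic further into the tail of the null distribution, i.e., attack strength and detectability trade off against each other.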

Authors

David J. Miller, Zhen Xiang, and George Kesidis

Reviews

Primary Rating: 4.7 (not enough ratings)
Secondary Ratings (Novelty, Significance, Scientific rigor): not yet rated
