Article

Minimalistic Attacks: How Little It Takes to Fool Deep Reinforcement Learning Policies

Journal

IEEE Transactions on Cognitive and Developmental Systems

Publisher

Institute of Electrical and Electronics Engineers (IEEE)
DOI: 10.1109/TCDS.2020.2974509

Keywords

Optimization; Neural networks; Games; Perturbation methods; Learning (artificial intelligence); Electronic mail; Analytical models; Adversarial attack; reinforcement learning (RL)

Funding

  1. National Research Foundation, Singapore under its AI Singapore Programme [AISG-RP-2018-004]
  2. Data Science and Artificial Intelligence Research Center at Nanyang Technological University


Summary

Recent studies show that neural-network-based policies can be easily fooled by adversarial examples. This article explores the limits of a model's vulnerability by defining three key settings for minimalistic attacks and testing their potency on six Atari games. The findings reveal that minimal perturbations significantly degrade, and can completely deceive, state-of-the-art policies.

Abstract

Recent studies have revealed that neural-network-based policies can be easily fooled by adversarial examples. However, while most prior works analyze the effects of perturbing every pixel of every frame under the assumption of white-box policy access, in this article we take a more restrictive view of adversary generation, with the goal of unveiling the limits of a model's vulnerability. In particular, we explore minimalistic attacks by defining three key settings: 1) Black-Box Policy Access, where the attacker only has access to the input (state) and output (action probabilities) of an RL policy; 2) Fractional-State Adversary, where only a few pixels are perturbed, with the extreme case being a single-pixel adversary; and 3) Tactically Chanced Attack, where only significant frames are tactically chosen to be attacked. We formulate the adversarial attack to accommodate these three settings and examine its potency on six Atari games against four fully trained state-of-the-art policies. In Breakout, for example, we surprisingly find that: 1) all policies suffer significant performance degradation when merely 0.01% of the input state is modified; and 2) the policy trained by DQN is completely deceived when only 1% of the frames are perturbed.
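To make the three settings concrete, the sketch below outlines one way such an attack loop could look. It is an illustrative assumption, not the authors' implementation: policy_fn (a black-box mapping from a frame to action probabilities), env (a Gym-style environment), the plain random search over single pixels, and the top-2 probability-gap rule for picking which frames to attack are all placeholders for the optimization and frame-selection scheme formulated in the paper.

```python
# Minimal sketch (not the authors' code) of an attack that respects the three
# settings: black-box policy access, a fractional-state (single-pixel)
# adversary, and tactically chanced (few-frame) attacks.
import numpy as np


def single_pixel_attack(policy_fn, state, n_queries=400, rng=None):
    """Fractional-State Adversary in its extreme one-pixel form.

    policy_fn: black-box callable, state (H, W) uint8 array -> action
    probabilities (Setting 1: only inputs and outputs, no gradients).
    Returns the single-pixel-perturbed state that most reduces the
    probability of the policy's originally preferred action.
    """
    rng = rng or np.random.default_rng()
    base_probs = policy_fn(state)
    a_star = int(np.argmax(base_probs))          # action the clean policy picks

    best_state, best_prob = state, base_probs[a_star]
    h, w = state.shape
    for _ in range(n_queries):
        y, x = rng.integers(h), rng.integers(w)  # which pixel to touch
        candidate = state.copy()
        candidate[y, x] = rng.integers(256)      # what value to write there
        prob = policy_fn(candidate)[a_star]
        if prob < best_prob:                     # keep the most damaging pixel
            best_state, best_prob = candidate, prob
    return best_state


def run_attacked_episode(env, policy_fn, gap_threshold=0.1):
    """Tactically Chanced Attack: perturb only 'significant' frames.

    As a stand-in criterion (the paper defines its own), a frame is attacked
    only when the gap between the top two action probabilities is small,
    i.e., when a tiny nudge is most likely to flip the chosen action.
    """
    state = env.reset()
    done, total_reward, attacked_frames = False, 0.0, 0
    while not done:
        probs = policy_fn(state)
        top2 = np.sort(probs)[::-1][:2]
        if top2[0] - top2[1] < gap_threshold:    # frame deemed worth attacking
            state = single_pixel_attack(policy_fn, state)
            attacked_frames += 1
        action = int(np.argmax(policy_fn(state)))
        state, reward, done, _ = env.step(action)
        total_reward += reward
    return total_reward, attacked_frames
```

Under these assumptions, shrinking gap_threshold drives the fraction of attacked frames toward the 1% regime reported for Breakout, while n_queries trades attack strength against the number of black-box policy evaluations spent per attacked frame.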

Authors

Xinghua Qu; Zhu Sun; Yew-Soon Ong; Abhishek Gupta; Pengfei Wei
