4.5 Article

Minimalistic Attacks: How Little It Takes to Fool Deep Reinforcement Learning Policies

Publisher

IEEE (Institute of Electrical and Electronics Engineers, Inc.)
DOI: 10.1109/TCDS.2020.2974509

Keywords

Optimization; Neural networks; Games; Perturbation methods; Learning (artificial intelligence); Electronic mail; Analytical models; Adversarial attack; reinforcement learning (RL)

Funding

  1. National Research Foundation, Singapore under its AI Singapore Programme [AISG-RP-2018-004]
  2. Data Science and Artificial Intelligence Research Center at Nanyang Technological University

Abstract

Recent studies have revealed that neural-network-based policies can be easily fooled by adversarial examples. However, while most prior works analyze the effects of perturbing every pixel of every frame under white-box policy access, in this article we take a more restrictive view of adversary generation, with the goal of unveiling the limits of a model's vulnerability. In particular, we explore minimalistic attacks by defining three key settings: 1) Black-Box Policy Access, where the attacker only has access to the input (state) and output (action probability) of an RL policy; 2) Fractional-State Adversary, where only a few pixels are perturbed, with the extreme case being a single-pixel adversary; and 3) Tactically Chanced Attack, where only significant frames are tactically chosen to be attacked. We formulate the adversarial attack to accommodate these three settings, and explore their potency on six Atari games by examining four fully trained state-of-the-art policies. In Breakout, for example, we surprisingly find that: 1) all policies show significant performance degradation when merely 0.01% of the input state is modified and 2) the policy trained by DQN is totally deceived by perturbing only 1% of the frames.
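To make the three settings concrete, the sketch below illustrates a black-box, fractional-state (single-pixel) adversary in Python. It is not the paper's method: it assumes a hypothetical callable `policy` that maps a grayscale Atari frame (e.g., 84x84) to action probabilities, and it uses plain random search over pixel positions and intensities rather than whatever optimizer the authors actually employ. The attacker only queries the policy's inputs and outputs, matching the black-box setting.

```python
# Minimal sketch of a black-box, single-pixel (fractional-state) adversary.
# Assumptions (illustrative, not from the paper): `policy` is any callable
# returning a vector of action probabilities for a 2-D uint8 state; the
# search keeps the single-pixel change that most reduces the probability
# of the action the clean policy prefers.
import numpy as np

def single_pixel_attack(state, policy, n_trials=400, rng=None):
    """Return a perturbed copy of `state` differing in at most one pixel."""
    rng = np.random.default_rng() if rng is None else rng
    base_probs = policy(state)
    target_action = int(np.argmax(base_probs))   # action the clean policy prefers
    best_state, best_score = state, base_probs[target_action]

    for _ in range(n_trials):
        candidate = state.copy()
        y = rng.integers(candidate.shape[0])      # random pixel row
        x = rng.integers(candidate.shape[1])      # random pixel column
        candidate[y, x] = rng.integers(0, 256)    # random replacement intensity
        score = policy(candidate)[target_action]  # black-box query only
        if score < best_score:                    # keep the most damaging pixel
            best_state, best_score = candidate, score
    return best_state
```

A tactically chanced variant would apply such a perturbation only on a small fraction of frames, for example only when the perturbed and clean action probabilities diverge enough to change the chosen action, rather than attacking every step of the episode.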

