Article

Diversity-augmented intrinsic motivation for deep reinforcement learning

Journal

NEUROCOMPUTING
Volume 468, Issue -, Pages 396-406

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2021.10.040

Keywords

Deep reinforcement learning; Curiosity-driven exploration; Determinantal point process


In reinforcement learning, intrinsic reward signals can help agents explore environments more effectively by automatically providing denser rewards. Moreover, measuring the diversity of states with a determinantal point process (DPP) model accelerates training and yields better performance.
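For context, here is the standard DPP definition that underlies such a diversity measure (general background, not text quoted from the paper): given a positive semidefinite similarity kernel $L$ over a ground set of items, a DPP assigns to each subset $S$ a probability

$$
P(Y = S) \;\propto\; \det(L_S), \qquad L_S = [L_{ij}]_{i,j \in S},
$$

where $L_S$ is the submatrix of $L$ indexed by $S$. Geometrically, $\det(L_S)$ is the squared volume spanned by the feature vectors of the items in $S$, so subsets of mutually similar items receive vanishing probability. A window of adjacent states can therefore be scored for diversity by $\log \det(L_S)$ computed on their embeddings.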
Abstract

In many real-world problems, the reward signals an agent receives are delayed or sparse, which makes training a reinforcement learning (RL) agent challenging. An intrinsic reward signal can help the agent explore such environments in the quest for novel states. In this work, we propose a general, end-to-end diversity-augmented intrinsic motivation for deep reinforcement learning that encourages the agent to explore new states and automatically provides denser rewards. Specifically, we measure the diversity of adjacent states under a model of state sequences based on a determinantal point process (DPP); this is coupled with a straight-through gradient estimator to enable end-to-end differentiability. The proposed approach is comprehensively evaluated on MuJoCo and on the Arcade Learning Environment (Atari and SuperMarioBros). The experiments show that an intrinsic reward based on the diversity measure derived from the DPP model accelerates the early stages of training in Atari games and SuperMarioBros. On MuJoCo, the approach improves on prior techniques for tasks with the standard reward setting, and achieves state-of-the-art performance on 12 out of 15 tasks with delayed rewards.
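As a concrete illustration, below is a minimal PyTorch sketch of a log-determinant diversity bonus over a window of adjacent state embeddings. This is a sketch under stated assumptions, not the authors' implementation: the function name dpp_diversity_bonus, the ridge term eps, and the mixing coefficient beta are hypothetical, and the paper's straight-through gradient estimator is not shown.

```python
# Minimal sketch of a DPP-style diversity bonus (illustrative; not the
# paper's exact formulation). `dpp_diversity_bonus` and `beta` are
# hypothetical names introduced for this example.
import torch
import torch.nn.functional as F

def dpp_diversity_bonus(embeddings: torch.Tensor, eps: float = 1e-3) -> torch.Tensor:
    """Score a window of k adjacent state embeddings by the log-determinant
    of their similarity kernel.

    embeddings: (k, d) tensor, one row per state in the window.
    Returns a scalar that is larger when the states are more mutually diverse.
    """
    phi = F.normalize(embeddings, dim=-1)   # unit-norm rows -> cosine-similarity kernel
    L = phi @ phi.T                         # (k, k) similarity matrix
    L = L + eps * torch.eye(L.shape[0], device=L.device, dtype=L.dtype)  # ridge keeps L positive definite
    return torch.logdet(L)                  # det(L) ~ squared volume spanned by the embeddings

# Usage: mix the bonus into the environment reward for the current window.
# r_total = r_extrinsic + beta * dpp_diversity_bonus(window_embeddings)
```

The log-determinant is differentiable in the embeddings, which is what makes a diversity bonus of this form compatible with the end-to-end training the abstract describes.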
