Proceedings Paper

Approximating Gradients for Differentiable Quality Diversity in Reinforcement Learning

PROCEEDINGS OF THE 2022 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE (GECCO'22) (2022)

Journal

PROCEEDINGS OF THE 2022 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE (GECCO'22)

Volume -, Issue -, Pages 1102-1111

Publisher

ASSOC COMPUTING MACHINERY

DOI: 10.1145/3512290.3528705

Keywords

quality diversity; reinforcement learning; neuroevolution

Categories

Computer Science, Cybernetics; Computer Science, Theory & Methods; Robotics

Funding

NSF NRI [1053128]
NSF GRFP [DGE-1842487]
Direct For Social, Behav & Economic Scie
Division Of Behavioral and Cognitive Sci [1053128] Funding Source: National Science Foundation

Summary

This paper addresses the problem of training robustly capable agents by combining quality diversity (QD) optimization with reinforcement learning (RL). It proposes two variants of a differentiable quality diversity (DQD) algorithm that approximate the gradients of performance and behavior, so that DQD can be applied to training agent policies. On four simulated locomotion tasks, one variant achieves results comparable to the state of the art in combining QD and RL, while the other performs comparably on two of the tasks.

Abstract

Consider the problem of training robustly capable agents. One approach is to generate a diverse collection of agent policies. Training can then be viewed as a quality diversity (QD) optimization problem, where we search for a collection of performant policies that are diverse with respect to quantified behavior. Recent work shows that differentiable quality diversity (DQD) algorithms greatly accelerate QD optimization when exact gradients are available. However, agent policies typically assume that the environment is not differentiable. To apply DQD algorithms to training agent policies, we must approximate gradients for performance and behavior. We propose two variants of the current state-of-the-art DQD algorithm that compute gradients via approximation methods common in reinforcement learning (RL). We evaluate our approach on four simulated locomotion tasks. One variant achieves results comparable to the current state of the art in combining QD and RL, while the other performs comparably in two locomotion tasks. These results provide insight into the limitations of current DQD algorithms in domains where gradients must be approximated. Source code is available at https://github.com/icaros-usc/dqd-rl.
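The abstract notes that gradients for performance and behavior must be approximated when the environment is not differentiable. As a minimal sketch of one approximation method common in RL, the block below estimates the gradient of a black-box function with an evolution-strategies (ES) style estimator based on mirrored Gaussian perturbations. The abstract does not say which estimators the two variants use, so this is illustrative only and is not taken from the linked repository. The names `es_gradient`, `sigma`, `num_pairs`, and the toy quadratic objective are assumptions; the toy function stands in for an episode return or a behavior measure.

```python
import numpy as np


def es_gradient(f, theta, sigma=0.1, num_pairs=500, seed=0):
    """Estimate the gradient of a black-box scalar function f at theta.

    Uses mirrored (antithetic) Gaussian perturbations, so f is only
    evaluated and never differentiated.
    """
    rng = np.random.default_rng(seed)
    grad = np.zeros_like(theta, dtype=float)
    for _ in range(num_pairs):
        eps = rng.standard_normal(theta.shape)
        # Central difference of f along a random direction eps.
        diff = f(theta + sigma * eps) - f(theta - sigma * eps)
        grad += diff * eps
    return grad / (2.0 * sigma * num_pairs)


def toy_objective(theta):
    # Hypothetical stand-in for an episode return; maximized at theta = 1.
    return -np.sum((theta - 1.0) ** 2)


theta = np.zeros(3)
approx = es_gradient(toy_objective, theta)
exact = -2.0 * (theta - 1.0)  # analytic gradient of the toy objective
```

In a QD setting the same kind of estimator could, in principle, be applied once to the performance function and once to each behavior measure, which gives the set of gradients a DQD algorithm expects. The estimate is noisy, and its variance shrinks as `num_pairs` grows.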
