Article

Sub-AVG: Overestimation reduction for cooperative multi-agent reinforcement learning

Journal

NEUROCOMPUTING
Volume 474, Issue -, Pages 94-106

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2021.12.039

Keywords

Cooperative multi-agent reinforcement learning; Joint action value decomposition; Overestimation error; Lower update target


Summary

Decomposing the centralized joint action value into per-agent individual action values is attractive in cooperative multi-agent reinforcement learning. However, Q-learning-based methods suffer from overestimation. This paper presents Sub-AVG, which eliminates excessive overestimation errors by using a lower update target.

Abstract

Decomposing the centralized joint action value (JAV) into per-agent individual action values (IAVs) is attractive in cooperative multi-agent reinforcement learning (MARL). In such tasks, IAVs based on local observations can execute decentralized policies, while the JAV is used for end-to-end training with traditional reinforcement learning methods, in particular the Q-learning algorithm. However, Q-learning-based methods suffer from overestimation, in which overestimated action values can lead to a sub-optimal policy. In this paper, we show that such overestimation can occur in the above Q-learning-based decomposition methods. Our solution is Sub-AVG, which uses a lower update target obtained by discarding the larger of the previously learned IAVs and averaging the retained ones, thus eliminating excessive overestimation errors. Experiments in the StarCraft Multi-Agent Challenge (SMAC) environment show that Sub-AVG leads to lower JAV estimates and better-performing policies. (c) 2021 Elsevier B.V. All rights reserved.
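To make the lower-update-target idea concrete, the minimal Python sketch below (not taken from the paper) builds a Sub-AVG-style target from an ensemble of previously learned IAV estimates for the greedy next action. The specific selection rule used here, keeping only estimates at or below the ensemble mean, and the name sub_avg_target are illustrative assumptions; the paper's exact criterion may differ.

import numpy as np

def sub_avg_target(past_iav_estimates, reward, gamma=0.99):
    # past_iav_estimates: K previously learned IAV estimates for the greedy
    # next action, e.g. taken from K snapshots of the target network.
    q = np.asarray(past_iav_estimates, dtype=np.float64)
    kept = q[q <= q.mean()]              # discard the larger estimates (assumed rule)
    return reward + gamma * kept.mean()  # average the retained ones -> lower target

# Toy usage with three hypothetical snapshots of the next-state IAV:
print(sub_avg_target([4.0, 5.5, 7.0], reward=1.0))  # averages only {4.0, 5.5}

Because the retained subset excludes the largest estimates, the resulting bootstrap target is never above the plain ensemble average, which is how the excessive overestimation error is suppressed.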


Reviews

Primary rating

4.6
Insufficient ratings

Secondary ratings

Novelty: -
Significance: -
Scientific rigor: -
