Journal
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
Volume 33, Issue 8, Pages 3872-3883
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TNNLS.2021.3054685
Keywords
Multi-agent systems; Consensus control; Games; Heuristic algorithms; Dynamic programming; Synchronization; Reinforcement learning; Asynchronous learning; data-based control; nonzero-sum games; optimal distributed consensus control; policy gradient (PG) reinforcement learning (RL)
Funding
- National Natural Science Foundation of China [61922063, 61733013]
- Natural Science Foundation of Shanghai [19ZR1461400, 17ZR1445800]
- Fundamental Research Funds for the Central Universities
This article introduces a data-based distributed control algorithm for the consensus control problem in multiagent systems, overcoming the challenge of asynchronous learning. By incorporating an actor-critic structure with neural networks, the algorithm achieves convergence and optimality in both the synchronous and asynchronous cases.
This article investigates the optimal distributed consensus control problem for discrete-time multiagent systems with completely unknown dynamics and differences in computational ability. The problem can be viewed as solving nonzero-sum games with distributed reinforcement learning (RL), where each agent is a player in these games. First, to guarantee the real-time performance of the learning algorithm, a data-based distributed control algorithm is proposed for multiagent systems using offline system-interaction data sets. By utilizing the interactive data produced while the real-time system runs, the proposed algorithm improves system performance based on distributed policy gradient RL. Convergence and stability are guaranteed via functional analysis and the Lyapunov method. Second, to address the asynchronous learning caused by differences in computational ability, the algorithm is extended to an asynchronous version in which each agent decides whether to execute a policy-improvement step independently of its neighbors. Furthermore, an actor-critic structure containing two neural networks is developed to implement the proposed algorithm in both the synchronous and asynchronous cases. Based on the method of weighted residuals, the convergence and optimality of the neural networks are guaranteed by proving that the approximation errors converge to zero. Finally, simulations are conducted to show the effectiveness of the proposed algorithm.
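The actor-critic, policy-gradient machinery the abstract describes can be illustrated very loosely with a single-agent toy problem. The scalar error dynamics, quadratic cost, Gaussian exploration policy, and all step sizes below are illustrative assumptions for a sketch, not the paper's algorithm, which operates on multiagent games with neural-network approximators:

```python
import numpy as np

# Toy actor-critic sketch: one agent drives a scalar "consensus error" e
# to zero. The critic estimates the cost-to-go V(e) = w * e**2; the actor
# is a Gaussian policy with mean theta * e, updated by a policy-gradient
# step on the TD error. All constants are illustrative assumptions.

rng = np.random.default_rng(0)

theta = 0.0                # actor parameter: policy mean u = theta * e
w = 0.0                    # critic parameter: V(e) = w * e**2
sigma = 0.3                # exploration noise std
alpha_a, alpha_c = 0.002, 0.05
gamma = 0.9                # discount factor

for episode in range(500):
    e = rng.uniform(-1.0, 1.0)          # initial consensus error
    for step in range(30):
        noise = sigma * rng.standard_normal()
        u = theta * e + noise           # sample from the stochastic policy
        e_next = e + u                  # assumed toy error dynamics
        r = e**2 + u**2                 # quadratic stage cost
        # TD error of the cost-to-go estimate
        delta = r + gamma * w * e_next**2 - w * e**2
        # critic: move V(e) toward the TD target (grad of V w.r.t. w is e**2)
        w += alpha_c * delta * e**2
        # actor: descend the policy gradient to *decrease* expected cost;
        # the score function of the Gaussian policy is noise * e / sigma**2
        theta -= alpha_a * delta * noise * e / sigma**2
        e = e_next

print(f"learned feedback gain theta = {theta:.3f}")
```

After training, `theta` should be negative, i.e., the actor has learned a stabilizing feedback that shrinks the error, which mirrors (in miniature) the critic/actor division of labor the article implements with two neural networks per agent.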