4.6 Article

Optimal consensus control for multi-agent systems: Multi-step policy gradient adaptive dynamic programming method

Journal

IET CONTROL THEORY AND APPLICATIONS
Volume 17, Issue 11, Pages 1443-1457

Publisher

WILEY
DOI: 10.1049/cth2.12473

Keywords

complex networks; dynamic programming; intelligent control; multi-agent systems; optimal control

This paper proposes a novel adaptive dynamic programming (ADP) method to solve the optimal consensus problem for a class of discrete-time multi-agent systems with completely unknown dynamics. It introduces a multi-step policy gradient ADP (MS-PGADP) algorithm, which is more efficient than one-step methods because the reward propagates faster. A new Q-function is defined to estimate the performance of an action in the current state. The optimality of the performance index function and the stability of the error system are proved using the Lyapunov stability theorem and functional analysis.
This paper presents a novel adaptive dynamic programming (ADP) method to solve the optimal consensus problem for a class of discrete-time multi-agent systems with completely unknown dynamics. Unlike classical reinforcement learning (RL)-based optimal control algorithms, which rely on the one-step temporal difference method, a multi-step (also called n-step) policy gradient ADP (MS-PGADP) algorithm is proposed to obtain the iterative control policies; multi-step methods have been shown to be more efficient owing to their faster propagation of the reward. Moreover, a novel Q-function is defined, which estimates the performance of taking an action in the current state. Then, using the Lyapunov stability theorem and functional analysis, the optimality of the performance index function and the stability of the error system are proved. Furthermore, actor-critic neural networks (NNs) are used to implement the proposed method. Inspired by the deep Q-network, a target network is also introduced to keep the NNs stable during training. Finally, two simulations are conducted to verify the effectiveness of the proposed algorithm.
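
To make the contrast between one-step and multi-step temporal difference targets concrete, the following is a minimal sketch of a generic n-step target. It is not the paper's MS-PGADP algorithm: the function name `n_step_target`, the discount factor `gamma`, the scalar stage costs, and the bootstrap value (which in practice might come from a target network) are all assumptions made for illustration.

```python
from typing import Sequence


def n_step_target(costs: Sequence[float], bootstrap_q: float, gamma: float = 1.0) -> float:
    """Generic n-step target (illustrative, not the paper's exact update).

    Computes sum_{k=0}^{n-1} gamma**k * costs[k] + gamma**n * bootstrap_q,
    where n = len(costs) and bootstrap_q is a Q-value estimate for the
    state reached after n steps (hypothetically from a target network).
    """
    target = bootstrap_q
    # Fold backwards so each observed cost picks up one more discount factor.
    for cost in reversed(costs):
        target = cost + gamma * target
    return target


# One-step TD bootstraps after a single observed cost ...
one_step = n_step_target([1.0], bootstrap_q=10.0, gamma=0.9)
# ... while a three-step target uses three observed costs before bootstrapping.
three_step = n_step_target([1.0, 2.0, 3.0], bootstrap_q=10.0, gamma=0.9)
```

With more observed costs in the target, less weight falls on the bootstrapped estimate, which is one common intuition for why multi-step targets propagate reward information faster.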
