4.7 Article

Decentralized Adaptive Optimal Tracking Control for Massive Autonomous Vehicle Systems With Heterogeneous Dynamics: A Stackelberg Game

Journal

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TNNLS.2021.3100417

Keywords

Approximate dynamic programming (ADP); mean-field games (MFGs); reinforcement learning; Stackelberg game

Funding

  1. U.S. Office of the Under Secretary of Defense for Research and Engineering (OUSD(RE)) [FA8750-15-2-0119]

This article addresses decentralized optimal tracking control for a large-scale autonomous vehicle system with heterogeneous dynamics. It builds on mean-field game theory and proposes a mean-field Stackelberg game formulation to overcome the scalability and communication limits of traditional multiagent algorithms. A dedicated A²C²M algorithm is designed to learn the optimal policies, and numerical simulations demonstrate the method's effectiveness.
In this article, a decentralized optimal tracking control problem is studied for a large-scale autonomous vehicle system with heterogeneous system dynamics. Because the number of agents is extremely large, traditional multiagent system (MAS) algorithms have been challenged for decades by the curse of dimensionality and by the unrealistic assumption that reliable, very large-scale communication links exist in uncertain environments. The emerging mean-field game (MFG) theory has recently been widely adopted to generate decentralized control methods that deal with these challenges by encoding the large-scale MAS's information into a time-varying probability density function (PDF) that can be obtained locally. However, traditional MFG methods assume that all agents are homogeneous, which is unrealistic in practical industrial applications such as the Internet of Things (IoT). Therefore, a novel mean-field Stackelberg game (MFSG) is formulated based on the Stackelberg game, in which the agents are classified into two categories: one major leader, whose decision dominates, and the remaining minor agents. Moreover, a hierarchical structure that treats all minor agents as a mean-field group is developed to address the homogeneous-agent assumption. Then, the actor-actor-critic-critic-mass (A²C²M) algorithm with five neural networks is designed to learn the optimal policies by solving the MFSG. Lyapunov theory is used to prove the convergence of the A²C²M neural networks and the stability of the closed-loop system. Finally, a series of numerical simulations is conducted to demonstrate the effectiveness of the developed method.
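The idea of encoding a large population into a PDF can be illustrated with a minimal sketch. This is not the article's A²C²M algorithm: the histogram estimator, the scalar Gaussian-distributed states, and the function names `empirical_pdf` and `mean_field_summary` are all assumptions made for illustration. The sketch only shows that the size of the population summary is set by the number of bins, not by the number of agents.

```python
import random


def empirical_pdf(states, lo, hi, bins):
    """Histogram estimate of the population density on [lo, hi)."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for s in states:
        idx = int((s - lo) / width)
        idx = min(max(idx, 0), bins - 1)  # clamp out-of-range states into the edge bins
        counts[idx] += 1
    n = len(states)
    # Divide by n * width so the estimate integrates to one over [lo, hi).
    return [c / (n * width) for c in counts]


def mean_field_summary(num_agents, seed=0, bins=20):
    """Hypothetical helper: draw scalar agent states and summarize them as a PDF."""
    rng = random.Random(seed)
    states = [rng.gauss(0.0, 1.0) for _ in range(num_agents)]  # assumed state distribution
    return empirical_pdf(states, -4.0, 4.0, bins)


small = mean_field_summary(100)
large = mean_field_summary(100_000)
# Both summaries have the same length regardless of population size.
print(len(small), len(large))
```

Under these assumptions, an agent that reasons about the population through such a density works with a fixed-size object even as the number of agents grows, which is the property the abstract attributes to the mean-field encoding.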
