4.7 Article

Adaptive Optimal Control for Stochastic Multiplayer Differential Games Using On-Policy and Off-Policy Reinforcement Learning

期刊

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TNNLS.2020.2969215

关键词

Games; Uncertainty; System dynamics; Nash equilibrium; Optimal control; Time-varying systems; Integral reinforcement learning (IRL); multiplayer systems; neural networks (NNs); systems with randomly time-varying parameters; uncertainty quantification

资金

  1. Office of Naval Research (ONR) [N00014-18-1-2221]
  2. NSF [1714519, 1839804]
  3. Directorate For Engineering
  4. Div Of Electrical, Commun & Cyber Sys [1839804] Funding Source: National Science Foundation

向作者/读者索取更多资源

Control-theoretic differential games have been used to solve optimal control problems in multiplayer systems. Most existing studies on differential games either assume deterministic dynamics or dynamics corrupted with additive noise. In realistic environments, multidimensional environmental uncertainties often modulate system dynamics in a more complicated fashion. In this article, we study stochastic multiplayer differential games, where the players' dynamics are modulated by randomly time-varying parameters. We first formulate two differential games for systems of general uncertain linear dynamics, including the two-player zero-sum and multiplayer nonzero-sum games. We then show that optimal control policies, which constitute the Nash equilibrium solutions, can be derived from the corresponding Hamiltonian functions. Stability is proven using the Lyapunov type of analysis. In order to solve the stochastic differential games online, we integrate reinforcement learning (RL) and an effective uncertainty sampling method called the multivariate probabilistic collocation method (MPCM). Two learning algorithms, including the on-policy integral RL (IRL) and off-policy IRL, are designed for the formulated games, respectively. We show that the proposed learning algorithms can effectively find the Nash equilibrium solutions for the stochastic multiplayer differential games.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.7
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据