4.7 Article

Off-Policy Interleaved Q-Learning: Optimal Control for Affine Nonlinear Discrete-Time Systems

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TNNLS.2018.2861945

Keywords

Affine nonlinear systems; interleaved learning; off-policy learning; optimal control; Q-learning

Funding

  1. National Natural Science Foundation of China [61673280, 61525302, 71602124, 61590922, 61503257]
  2. Open Project of the State Key Laboratory of Synthetical Automation for Process Industries [PAL-N201603]
  3. 111 Project [B08015]
  4. Fundamental Research Funds for the Central Universities [N160804001]
  5. Project of Liaoning Province [LR2017006]

Abstract

In this paper, a novel off-policy interleaved Q-learning algorithm is presented for solving the optimal control problem of affine nonlinear discrete-time (DT) systems, using only measured data along the system trajectories. The affine nonlinear structure of the systems, their unknown dynamics, and the off-policy learning approach pose tremendous challenges to approximating optimal controllers. To this end, an on-policy Q-learning method for optimal control of affine nonlinear DT systems is reviewed first, and its convergence is rigorously proven. The bias in the solution of the Q-function-based Bellman equation, caused by adding probing noise to the system to satisfy persistent excitation, is also analyzed for the on-policy Q-learning approach. Then, a behavior control policy is introduced, followed by an off-policy Q-learning algorithm; the convergence of this algorithm and the absence of bias in the solution of the optimal control problem when probing noise is added are investigated. Third, three neural networks are run by the interleaved Q-learning approach in an actor-critic framework. Thus, a novel off-policy interleaved Q-learning algorithm is derived, and its convergence is proven. Simulation results are given to verify the effectiveness of the proposed method.
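To make the off-policy ingredient described in the abstract concrete, the following is a minimal Python sketch of data-driven Q-learning for an affine nonlinear DT system x_{k+1} = f(x_k) + g(x_k)u_k. It is not the paper's algorithm verbatim: the example dynamics f and g, the quadratic stage cost, the behavior policy with probing excitation, and the linear-in-features critic with a grid-search actor (standing in for the three neural networks of the actor-critic framework) are all illustrative assumptions. Data are collected once under a fixed behavior policy and then reused in every interleaved critic/actor iteration, so the target policy is evaluated without ever being applied to the system.

```python
import numpy as np

# Hypothetical affine nonlinear DT system: x_{k+1} = f(x_k) + g(x_k) u_k
def f(x):
    return np.array([0.8 * x[0] + 0.1 * x[1]**2, 0.9 * x[1] - 0.05 * x[0] * x[1]])

def g(x):
    return np.array([[0.0], [1.0 + 0.1 * np.cos(x[0])]])

# Assumed quadratic stage cost r(x, u) = x'Qx + u'Ru
Q_cost, R_cost = np.eye(2), np.eye(1)
def stage_cost(x, u):
    return float(x @ Q_cost @ x + u @ R_cost @ u)

# Critic features for Q(x, u): quadratic basis over the joint vector (x, u)
def phi(x, u):
    z = np.concatenate([x, u])
    return np.outer(z, z)[np.triu_indices(z.size)]

def q_value(w, x, u):
    return float(w @ phi(x, u))

# Actor: greedy policy from the current critic via a coarse action search
u_grid = np.linspace(-2.0, 2.0, 81).reshape(-1, 1)
def actor(w, x):
    return min(u_grid, key=lambda u: q_value(w, x, u))

# Behavior policy with probing excitation; the target policy is never applied
def behavior_policy(x, k):
    return np.array([-0.5 * x[1] + 0.3 * np.sin(0.7 * k)])

gamma = 0.95
data = []
x = np.array([1.0, -1.0])
for k in range(400):
    u = behavior_policy(x, k)
    x_next = f(x) + (g(x) @ u)
    data.append((x.copy(), u.copy(), x_next.copy()))
    x = x_next if np.linalg.norm(x_next) < 10 else np.array([1.0, -1.0])

# Interleaved learning: alternate least-squares critic fits and greedy actor updates
w = np.zeros(phi(x, np.zeros(1)).size)
for it in range(30):
    A, b = [], []
    for (xk, uk, xk1) in data:
        u_target = actor(w, xk1)  # target policy evaluated on off-policy data
        b.append(stage_cost(xk, uk) + gamma * q_value(w, xk1, u_target))
        A.append(phi(xk, uk))
    w_new, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    if np.linalg.norm(w_new - w) < 1e-6:
        break
    w = w_new

print("learned critic weights:", np.round(w, 3))
print("greedy control at x=[1,-1]:", actor(w, np.array([1.0, -1.0])))
```

Because the fixed behavior policy supplies the exploration, no probing noise needs to be injected into the target policy itself, which is the mechanism the abstract credits for avoiding bias in the learned solution.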

