Article

Reinforcement Learning Controller Design for Affine Nonlinear Discrete-Time Systems using Online Approximators

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TSMCB.2011.2166384

Keywords

Adaptive critic; dynamic programming (DP); Lyapunov method; neural networks (NNs); online approximators (OLAs); online learning; reinforcement learning

Funding

  1. NSF [ECCS 0624644, ECCS 0901562]
  2. NSF of China [60804064]
  3. Qianjiang Program of Zhejiang Province [2011R10024]

Abstract

In this paper, reinforcement-learning-based adaptive critic controller designs, using state feedback and output feedback, are proposed by employing online approximators (OLAs) for general multi-input multi-output affine unknown nonlinear discrete-time systems in the presence of bounded disturbances. The proposed controller design has two entities: an action network that is designed to produce the optimal control signal and a critic network that evaluates the performance of the action network. The critic estimates the cost-to-go function, which is tuned online using recursive equations derived from heuristic dynamic programming. Here, neural networks (NNs) are used for both the action and critic networks, although any OLAs, such as radial basis functions, splines, or fuzzy logic, could be utilized. For the output-feedback counterpart, an additional NN is designated as an observer to estimate the unavailable system states, and thus, the separation principle is not required. The NN weight tuning laws for both controller schemes are derived while ensuring uniform ultimate boundedness of the closed-loop system using Lyapunov theory. Finally, the effectiveness of the two controllers is tested in simulation on a pendulum balancing system and a two-link robotic arm system.
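The abstract describes a heuristic dynamic programming (HDP) structure in which the critic learns the cost-to-go from the recursion J(x_k) = r(x_k, u_k) + J(x_{k+1}) while the action network is tuned to reduce that estimated cost. Below is a minimal illustrative sketch of such an actor-critic update loop; the scalar affine plant f and g, the quadratic utility, the radial-basis features, the learning rates, and the finite-difference gradient are assumptions made for this example and are not taken from the paper, which instead derives Lyapunov-based weight tuning laws for MIMO systems.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_features(x, centers, width=1.0):
    """Radial-basis features phi(x) shared by the actor and critic (assumed OLA)."""
    return np.exp(-((x - centers) ** 2) / (2.0 * width ** 2))

# Hypothetical scalar affine plant x_{k+1} = f(x_k) + g(x_k) * u_k (assumption).
def f(x):
    return 0.9 * np.sin(x)

def g(x):
    return 1.0 + 0.1 * np.cos(x)

centers = np.linspace(-2.0, 2.0, 9)   # feature centers (assumed)
w_critic = np.zeros_like(centers)     # critic weights: J(x) ~= w_critic . phi(x)
w_actor = np.zeros_like(centers)      # actor weights:  u(x) ~= w_actor  . phi(x)
alpha_c, alpha_a = 0.05, 0.02         # learning rates (assumed)
Q, R = 1.0, 0.1                       # quadratic utility weights (assumed)
eps = 1e-3                            # finite-difference step for dJ/dx

x = 1.5
for k in range(3000):
    phi = rbf_features(x, centers)
    u = w_actor @ phi                  # control produced by the action approximator
    r = Q * x ** 2 + R * u ** 2        # one-step utility r(x_k, u_k)
    x_next = f(x) + g(x) * u
    phi_next = rbf_features(x_next, centers)

    # Critic step: temporal-difference error of the HDP cost-to-go recursion
    #   J(x_k) = r(x_k, u_k) + J(x_{k+1}),
    # followed by a semi-gradient move of J(x_k) toward its target.
    delta = r + w_critic @ phi_next - w_critic @ phi
    w_critic += alpha_c * delta * phi

    # Actor step: gradient descent on r + J(x_{k+1}) with respect to u,
    # whose stationarity condition is 2*R*u + g(x) * dJ/dx(x_{k+1}) = 0.
    dJ_dx_next = (w_critic @ rbf_features(x_next + eps, centers)
                  - w_critic @ rbf_features(x_next - eps, centers)) / (2.0 * eps)
    w_actor -= alpha_a * (2.0 * R * u + g(x) * dJ_dx_next) * phi

    # Restart from a random state if the sketch diverges.
    x = x_next if abs(x_next) < 5.0 else rng.uniform(-2.0, 2.0)

print("final |x|:", abs(x))
```

The simple gradient steps above only illustrate the division of labor between the two online approximators; they carry none of the uniform-ultimate-boundedness guarantees that the paper's Lyapunov-derived tuning laws provide.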
