Article

Actor-Critic-Based Optimal Tracking for Partially Unknown Nonlinear Discrete-Time Systems

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2015)

Journal

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS

Volume 26, Issue 1, Pages 140-151

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TNNLS.2014.2358227

Keywords

Actor-critic algorithm; discrete-time (DT) nonlinear optimal tracking; input constraints; neural network (NN); reinforcement learning (RL)

Categories

Computer Science, Artificial Intelligence; Computer Science, Hardware & Architecture; Computer Science, Theory & Methods; Engineering, Electrical & Electronic

Funding

National Science Foundation [ECCS-1405173, IIS-1208623]
U.S. Office of Naval Research, Arlington, VA, USA [N00014-13-1-0562]
Air Force Office of Scientific Research, Arlington, VA, USA, through the European Office of Aerospace Research and Development [13-3055]
National Natural Science Foundation of China [61120106011]
Ministry of Education, China [B08015]

Abstract

This paper presents a partially model-free adaptive optimal control solution to the deterministic nonlinear discrete-time (DT) tracking control problem in the presence of input constraints. The tracking error dynamics and reference trajectory dynamics are first combined to form an augmented system. Then, a new discounted performance function based on the augmented system is presented for the optimal nonlinear tracking problem. In contrast to the standard solution, which finds the feedforward and feedback terms of the control input separately, the minimization of the proposed discounted performance function gives both feedback and feedforward parts of the control input simultaneously. This enables us to encode the input constraints into the optimization problem using a nonquadratic performance function. The DT tracking Bellman equation and tracking Hamilton-Jacobi-Bellman (HJB) equation are derived. An actor-critic-based reinforcement learning algorithm is used to learn the solution to the tracking HJB equation online without requiring knowledge of the system drift dynamics. That is, two neural networks (NNs), namely, an actor NN and a critic NN, are tuned online and simultaneously to generate the optimal bounded control policy. A simulation example is given to show the effectiveness of the proposed method.
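
The abstract gives no formulas, so the following is only a sketch of how the formulation it describes is commonly written. All symbols are assumptions introduced here for illustration and do not come from the abstract: the augmented state X_k, the drift F, the input map G, the state weight Q_1, the input weight R, the discount factor \gamma, and the input bound \bar{u}. The input is taken to be scalar to keep the nonquadratic penalty simple.

```latex
% Augmented state: tracking error e_k = x_k - r_k stacked with the reference r_k.
% F collects the (unknown) drift dynamics; G is the input dynamics.
X_k = \begin{bmatrix} e_k \\ r_k \end{bmatrix}, \qquad
X_{k+1} = F(X_k) + G(X_k)\,u_k

% Discounted performance function on the augmented state, 0 < \gamma \le 1
V(X_k) = \sum_{i=k}^{\infty} \gamma^{\,i-k}\,\bigl[ X_i^{\top} Q_1 X_i + W(u_i) \bigr]

% Nonquadratic input penalty encoding the constraint |u| \le \bar{u} (scalar input, R > 0)
W(u) = 2R \int_{0}^{u} \bar{u}\,\tanh^{-1}\!\left( \frac{s}{\bar{u}} \right) ds

% Tracking Bellman equation implied by the sum above
V(X_k) = X_k^{\top} Q_1 X_k + W(u_k) + \gamma\, V(X_{k+1})
```

Under these assumptions, the Bellman equation follows by splitting off the i = k term of the sum. Because the penalty is built from an inverse hyperbolic tangent, the minimizing input takes the form of a scaled tanh and therefore stays within the bound \bar{u}. This is one way to read the abstract's statement that the constraints are encoded in the optimization problem itself.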
