Journal
IEEE TRANSACTIONS ON CYBERNETICS
Volume 52, Issue 7, Pages 6046-6058
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TCYB.2020.3044595
Keywords
Optimal control; Artificial neural networks; Stability analysis; Control systems; Adaptation models; Training; Dynamic programming; Adaptive control; event-triggered control (ETC); heuristic dynamic programming (HDP); nonlinear discrete-time (NDT) systems; temporal difference (TD) (λ)
Funding
- National Key Research and Development Program of China [2018YFB1700500]
- State Key Laboratory of Alternate Electrical Power System with Renewable Energy Sources [LAPS19005]
- National Natural Science Foundation of China [62022044]
- Jiangsu Natural Science Foundation for Distinguished Young Scholars [BK20190039]
The proposed event-triggered HDP(λ) (ETHDP(λ)) optimal control strategy targets nonlinear discrete-time systems with unknown dynamics, using an event-triggered condition to reduce computation and communication requirements.
The heuristic dynamic programming HDP(λ)-based optimal control strategy, which iteratively takes a long-term prediction parameter λ into account, noticeably accelerates learning. It also reduces the computational complexity that the state-associated extra variable introduces when the traditional value-gradient learning method computes the λ-return. However, as the number of iterations increases, the computational cost grows dramatically, which poses a major challenge for optimal control with limited bandwidth and computational units. In this article, we propose an event-triggered HDP(λ) (ETHDP(λ)) optimal control strategy for nonlinear discrete-time (NDT) systems with unknown dynamics. The iterative relation for the λ-return of the final target value is derived first. An event-triggered condition that ensures system stability is then designed to reduce the computation and communication requirements. Next, we build a model-actor-critic neural network (NN) structure, in which the model NN evaluates the system state to obtain the λ-return of the current-time target value, which in turn is used to obtain the real-time update errors of the critic NN. The event-triggered optimal control signal and the one-step-return value are approximated by the actor NN and the critic NN, respectively. Then, the event-trigger-based uniformly ultimately bounded (UUB) stability of the system state and the NN weight errors is demonstrated by applying the Lyapunov technique. Finally, we illustrate the effectiveness of the proposed ETHDP(λ) strategy in two cases.
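To make the two central ideas in the abstract more concrete, the following Python sketch shows a textbook truncated λ-return recursion and a generic relative-threshold event trigger. It is an illustration under stated assumptions, not the paper's derivation: the function names, the discount factor `gamma`, the `threshold_gain` parameter, and the specific trigger rule are all hypothetical, since the abstract does not give the actual iterative relation or triggering condition.

```python
import math
from typing import List, Sequence


def lambda_returns(costs: Sequence[float],
                   next_values: Sequence[float],
                   gamma: float,
                   lam: float) -> List[float]:
    """Truncated lambda-return via the standard backward recursion.

    costs[k]       : stage cost observed at step k
    next_values[k] : critic estimate of the value at state x_{k+1}
    The recursion used here (an assumption, the textbook TD(lambda) form) is
        G_k = costs[k] + gamma * ((1 - lam) * next_values[k] + lam * G_{k+1}),
    bootstrapped at the end of the horizon with the last critic estimate.
    """
    if len(costs) != len(next_values):
        raise ValueError("costs and next_values must have the same length")
    if not costs:
        return []
    returns = [0.0] * len(costs)
    g_next = next_values[-1]  # bootstrap value beyond the horizon
    for k in reversed(range(len(costs))):
        g_next = costs[k] + gamma * ((1.0 - lam) * next_values[k] + lam * g_next)
        returns[k] = g_next
    return returns


def should_trigger(state: Sequence[float],
                   last_sampled_state: Sequence[float],
                   threshold_gain: float) -> bool:
    """Hypothetical event trigger: fire when the gap between the current
    state and the last sampled state exceeds a fraction of the state norm.
    The paper's stability-preserving condition is not reproduced here."""
    gap = math.sqrt(sum((x - s) ** 2 for x, s in zip(state, last_sampled_state)))
    state_norm = math.sqrt(sum(x ** 2 for x in state))
    return gap > threshold_gain * state_norm
```

With `lam = 0` the recursion collapses to the one-step return `costs[k] + gamma * next_values[k]`, and with `lam = 1` it accumulates discounted costs over the whole horizon, which is the sense in which λ acts as a long-term prediction parameter. Between trigger events the controller would simply hold the last control signal, which is where the savings in computation and communication come from.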