Article

A partial policy iteration ADP algorithm for nonlinear neuro-optimal control with discounted total reward

NEUROCOMPUTING (2021)

Journal

NEUROCOMPUTING

Volume 424, Pages 23-34

Publisher

ELSEVIER

DOI: 10.1016/j.neucom.2020.11.014

Keywords

Adaptive critic designs; Adaptive dynamic programming; Policy iteration; Neural networks; Neuro-dynamic programming; Nonlinear systems; Optimal control

Category

Computer Science, Artificial Intelligence

Summary

This paper proposes a partial policy iteration adaptive dynamic programming algorithm for the optimal control of nonlinear systems. Because the control law is updated only locally, the computational burden is reduced and the algorithm can run on low-performance devices. A convergence analysis and the supporting theory are provided.

Abstract

This paper constructs a partial policy iteration adaptive dynamic programming (ADP) algorithm to solve the optimal control problem of nonlinear systems with discounted total reward. Compared with the traditional policy iteration ADP algorithm, the approach updates the iterative control law only in a local region of the global system state space. As a result, the computational burden placed on the processing unit at each iteration can be significantly reduced, which enables the algorithm to be executed on low-performance devices such as smartphones, smartwatches, and Internet of Things (IoT) devices. We provide a convergence analysis showing that the generated sequence of value functions is monotonically nonincreasing and eventually reaches a local optimum. In addition, the corresponding local policy space is developed theoretically for the first time. Furthermore, we prove that when the sequence of local system state spaces is chosen properly, the developed algorithm is capable of finding the global optimal performance index function for the nonlinear system. Finally, we present a numerical simulation to demonstrate the effectiveness of the proposed algorithm. © 2020 Elsevier B.V. All rights reserved.
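The idea of improving the control law only on a local region can be illustrated with a small finite-state sketch. This is not the paper's neural-network implementation. The toy dynamics (`step`), the quadratic stage cost (`utility`), the discount factor, the state grid, and the choice of overlapping regions are all assumptions made for illustration. The sketch evaluates the current law exactly, then applies a greedy update only to the states in one region at a time while every other state keeps its previous action.

```python
# Illustrative tabular sketch of partial policy iteration with a discounted cost.
# The system, cost, grid and regions are hypothetical, not taken from the paper.
GAMMA = 0.9
STATES = list(range(-5, 6))   # discretized scalar state
ACTIONS = (-1, 0, 1)


def step(x, u):
    """Toy dynamics: shift the state by u, saturating at the grid boundary."""
    return max(STATES[0], min(STATES[-1], x + u))


def utility(x, u):
    """Quadratic stage cost."""
    return x * x + u * u


def q_value(x, u, V):
    """One-step cost plus the discounted value of the successor state."""
    return utility(x, u) + GAMMA * V[step(x, u)]


def evaluate(policy, tol=1e-10):
    """Policy evaluation by repeated sweeps of V(x) = U(x, u) + GAMMA * V(x')."""
    V = {x: 0.0 for x in STATES}
    while True:
        delta = 0.0
        for x in STATES:
            new = q_value(x, policy[x], V)
            delta = max(delta, abs(new - V[x]))
            V[x] = new
        if delta < tol:
            return V


def improve_locally(policy, V, region):
    """Greedy improvement restricted to `region`; other states keep their action."""
    new_policy = dict(policy)
    for x in region:
        best = min(ACTIONS, key=lambda u: q_value(x, u, V))
        # Switch only on a strict improvement so ties cannot cause chattering.
        if q_value(x, best, V) < q_value(x, policy[x], V) - 1e-9:
            new_policy[x] = best
    return new_policy


def partial_policy_iteration(regions, sweeps=10):
    policy = {x: 0 for x in STATES}   # initial law: apply no control
    history = [evaluate(policy)]      # value function after each local update
    for _ in range(sweeps):
        for region in regions:
            policy = improve_locally(policy, history[-1], region)
            history.append(evaluate(policy))
    return policy, history


# Overlapping windows whose union covers the whole grid.
regions = [STATES[i:i + 4] for i in range(0, len(STATES), 3)]
policy, history = partial_policy_iteration(regions)
```

In this toy setting each local update touches at most four states, and the stored value functions in `history` never increase from one update to the next. Because the regions together cover the whole grid, the final value function also satisfies the Bellman optimality equation on every state, which mirrors the abstract's condition that the sequence of local state spaces be chosen properly.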
