Article

Linear quadratic tracking control of unknown systems: A two-phase reinforcement learning method

Journal

AUTOMATICA
Volume 148

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.automatica.2022.110761

Keywords

Reinforcement learning; Linear quadratic tracking control; Discounted cost function; Singular perturbation theory

Abstract

This paper considers the problem of linear quadratic tracking control (LQTC) with a discounted cost function for unknown systems. The existing design methods often require the discount factor to be small enough to guarantee the closed-loop stability. However, solving the discounted algebraic Riccati equation (ARE) may lead to ill-conditioned numerical issues if the discount factor is too small. By singular perturbation theory, we decompose the full-order discounted ARE into a reduced-order ARE and a Sylvester equation, which facilitate designing the feedback and feedforward control gains. The obtained controller is proved to be a stabilizing and near-optimal solution to the original LQTC problem. In the framework of reinforcement learning, both on-policy and off-policy two-phase learning algorithms are derived to design the near-optimal tracking control policy without knowing the discount factor. The advantages of the developed results are illustrated by comparative simulation results. (c) 2022 Published by Elsevier Ltd.
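The decomposition described in the abstract can be sketched numerically. The following is a minimal illustration, not the paper's model-free algorithm: it assumes the system matrices are known and uses hypothetical values for the plant, reference generator, weights, and discount factor. The discounted algebraic Riccati equation is absorbed into a standard ARE by shifting A by half the discount factor, yielding the feedback gain; the feedforward gain then comes from a Sylvester equation, mirroring the reduced-order ARE/Sylvester split the abstract describes.

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_sylvester

# Hypothetical example (not from the paper): continuous-time plant
#   x' = A x + B u, tracked output C x, reference generator r' = F r,
#   reference output E r, and discounted cost
#   J = integral of exp(-gamma*t) [(Cx - Er)' Qe (Cx - Er) + u' R u] dt.
gamma = 0.05                              # small discount factor
A = np.array([[0., 1.], [-2., -3.]])
B = np.array([[0.], [1.]])
C = np.array([[1., 0.]])
F = np.array([[0., 1.], [-1., 0.]])       # sinusoidal reference dynamics
E = np.array([[1., 0.]])
Qe = np.array([[10.]])
R = np.array([[1.]])

# Discounted ARE, handled by shifting A to A - (gamma/2) I:
#   (A - g/2 I)' P + P (A - g/2 I) - P B R^{-1} B' P + C' Qe C = 0
A_shift = A - 0.5 * gamma * np.eye(2)
P = solve_continuous_are(A_shift, B, C.T @ Qe @ C, R)
K = np.linalg.solve(R, B.T @ P)           # feedback gain

# Feedforward part from a Sylvester equation:
#   (A - B K)' S + S (F - gamma I) = C' Qe E
S = solve_sylvester((A - B @ K).T, F - gamma * np.eye(2), C.T @ Qe @ E)
Kff = -np.linalg.solve(R, B.T @ S)        # feedforward gain

# Tracking control law: u = -K x + Kff r
print("K   =", K)
print("Kff =", Kff)
print("closed-loop poles:", np.linalg.eigvals(A - B @ K))
```

With a stabilizable plant and a sufficiently small discount factor, the feedback gain K obtained this way is stabilizing for the undiscounted closed loop, which is the near-optimality property the paper establishes for its decomposed design.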
