4.6 Article

Unconstrained feedback controller design using Q-learning from noisy data

Journal

COMPUTERS & CHEMICAL ENGINEERING
Volume 177, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.compchemeng.2023.108325

Keywords

Reinforcement learning; Q-learning; Least squares policy iteration; System identification; Maximum likelihood estimation; Linear quadratic regulator

This paper proposes a novel model-free Q-learning approach to estimate linear feedback controllers from noisy process data. The approach is modified to handle unknown noise covariances and is applied to estimate feedback controllers for linear systems with both process and measurement noise. A model-based approach is also presented for comparison.
This paper develops a novel model-free, Q-learning-based approach to estimate linear, unconstrained feedback controllers from noisy process data. The proposed method extends an available approach developed to estimate the linear quadratic regulator (LQR) for linear systems with full state measurements driven by Gaussian process noise of known covariance. First, we modify the approach to treat the case of an unknown noise covariance. Then, we use the modified approach to estimate a feedback controller for linear systems with both process and measurement noise and only output measurements. We also present a model-based maximum likelihood estimation (MLE) approach to determine a linear dynamic model and noise covariances from data, which is used to construct a regulator and state estimator for comparisons in simulation studies. The performance of the model-free and model-based controller estimation approaches is compared on an example heating, ventilation, and air-conditioning (HVAC) system. We show that the proposed Q-learning approach estimates a reasonably accurate feedback controller from 24 h of noisy data. The controllers estimated using the model-free and model-based approaches provide similar closed-loop performance, with losses of 3.5% and 2.7%, respectively, compared to a perfect controller that uses the true dynamic model and noise covariances of the HVAC system. Finally, we give future work directions for model-free controller design approaches by discussing some remaining advantages of model-based approaches.
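
To make the idea of estimating an LQR gain by least squares policy iteration concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a noise-free system with full state measurements, so it omits the noise-covariance and output-feedback extensions that the abstract describes. The two-state system, the cost weights, and all function names (`quad_features`, `theta_to_H`, `lspi_lqr`, `riccati_gain`) are hypothetical choices made for illustration; the system is a toy example, not the HVAC model. The learner sees only the transitions (x, u, x+), while the matrices A and B are used solely to simulate data and to compute a reference gain.

```python
import numpy as np


def quad_features(z):
    """Monomials z_i * z_j (i <= j), scaled so features @ theta == z^T H z."""
    i, j = np.triu_indices(len(z))
    scale = np.where(i == j, 1.0, 2.0)  # off-diagonal terms occur twice
    return scale * z[i] * z[j]


def theta_to_H(theta, n):
    """Rebuild the symmetric matrix H from its upper-triangular entries."""
    H = np.zeros((n, n))
    i, j = np.triu_indices(n)
    H[i, j] = theta
    H[j, i] = theta
    return H


def lspi_lqr(X, U, Xn, Qc, Rc, K0, iters=10):
    """Least squares policy iteration for u = -K x from transitions (x, u, x+)."""
    nx = X.shape[1]
    n = nx + U.shape[1]
    K = K0.copy()
    stage_cost = np.array([x @ Qc @ x + u @ Rc @ u for x, u in zip(X, U)])
    Phi = np.array([quad_features(np.concatenate([x, u])) for x, u in zip(X, U)])
    for _ in range(iters):
        # Policy evaluation: z^T H z - z+^T H z+ = stage cost, with u+ = -K x+.
        Phi_next = np.array(
            [quad_features(np.concatenate([xn, -K @ xn])) for xn in Xn]
        )
        theta, *_ = np.linalg.lstsq(Phi - Phi_next, stage_cost, rcond=None)
        H = theta_to_H(theta, n)
        # Policy improvement: the minimizer of z^T H z over u is -H_uu^{-1} H_ux x.
        K = np.linalg.solve(H[nx:, nx:], H[nx:, :nx])
    return K


def riccati_gain(A, B, Qc, Rc, iters=500):
    """Model-based reference: iterate the discrete-time Riccati recursion."""
    P = Qc.copy()
    for _ in range(iters):
        G = np.linalg.solve(Rc + B.T @ P @ B, B.T @ P @ A)
        P = Qc + A.T @ P @ A - A.T @ P @ B @ G
    return np.linalg.solve(Rc + B.T @ P @ B, B.T @ P @ A)


# Hypothetical stable, controllable toy system (assumption, not from the paper).
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Qc, Rc = np.eye(2), np.eye(1)

# Simulate one noise-free trajectory driven by random exploratory inputs.
rng = np.random.default_rng(0)
N = 200
X = np.zeros((N, 2))
Xn = np.zeros((N, 2))
U = rng.standard_normal((N, 1))
x = np.array([1.0, -1.0])
for k in range(N):
    X[k] = x
    x = A @ x + B @ U[k]
    Xn[k] = x

# K0 = 0 is a stabilizing initial policy here because A itself is stable.
K_q = lspi_lqr(X, U, Xn, Qc, Rc, K0=np.zeros((1, 2)))
K_ref = riccati_gain(A, B, Qc, Rc)
print("Q-learning gain:", K_q)
print("Riccati gain:   ", K_ref)
```

In this noise-free setting the least squares problem is exact, so the learned gain should agree with the Riccati gain up to numerical precision. With process or measurement noise, the plain regression above would generally be biased, which is the kind of difficulty the approach summarized in the abstract is designed to address.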
