Journal
SIAM JOURNAL ON CONTROL AND OPTIMIZATION
Volume 61, Issue 1, Pages 72-104
Publisher
SIAM PUBLICATIONS
DOI: 10.1137/22M1476757
Keywords
Markov decision processes; risk-sensitive average cost criterion; optimal policies; policy iteration algorithm
In this paper, we investigate the risk-sensitive average optimality in discrete-time Markov decision processes with denumerable states and unbounded costs. By utilizing an approximation method, we derive the multiplicative Poisson equation under suitable ergodicity conditions. Furthermore, we establish the existence of a unique solution to the risk-sensitive average cost optimality equation and provide an equivalent characterization of the set of all optimal stationary policies. Finally, we introduce the policy iteration algorithm and demonstrate its convergence.
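The policy iteration scheme mentioned in the abstract can be illustrated on a toy problem. The following is only a minimal sketch under strong simplifying assumptions that are not taken from the paper: a finite state space, strictly positive transition probabilities, bounded costs, and a risk-sensitivity parameter fixed at 1. The paper itself treats denumerable states and unbounded costs. The names `evaluate`, `improve`, and `policy_iteration`, and the layouts `P[a][x][y]` and `c[x][a]`, are hypothetical. Policy evaluation solves the multiplicative Poisson equation e^rho * h(x) = e^{c(x, f(x))} * sum_y P(y | x, f(x)) * h(y) by power iteration, which is a stand-in technique chosen for this sketch.

```python
import math


def evaluate(P, c, policy, iters=10_000, tol=1e-13):
    """Approximate (rho, h) solving e^rho * h = Q_f h by power iteration.

    Q_f[x][y] = exp(c[x][f(x)]) * P[f(x)][x][y]; h is normalised so h[0] == 1.
    Assumes Q_f has strictly positive entries (assumption of this sketch).
    """
    n = len(policy)
    Q = [[math.exp(c[x][policy[x]]) * P[policy[x]][x][y] for y in range(n)]
         for x in range(n)]
    h = [1.0] * n
    growth = 1.0
    for _ in range(iters):
        new_h = [sum(Q[x][y] * h[y] for y in range(n)) for x in range(n)]
        growth = new_h[0]  # equals the Perron root at the fixed point (h[0] == 1)
        new_h = [v / growth for v in new_h]
        converged = max(abs(a - b) for a, b in zip(new_h, h)) < tol
        h = new_h
        if converged:
            break
    return math.log(growth), h


def improve(P, c, h, policy):
    """Pick, in each state, an action minimising e^{c(x,a)} * sum_y P(y|x,a) h(y)."""
    n, m = len(c), len(c[0])
    new_policy = []
    for x in range(n):
        score = [math.exp(c[x][a]) * sum(P[a][x][y] * h[y] for y in range(n))
                 for a in range(m)]
        best = min(range(m), key=score.__getitem__)
        # Keep the current action unless another one is strictly better.
        keep = score[policy[x]] <= score[best] * (1 + 1e-12)
        new_policy.append(policy[x] if keep else best)
    return new_policy


def policy_iteration(P, c, max_iter=100):
    """Alternate evaluation and improvement until the policy stops changing."""
    policy = [0] * len(c)
    for _ in range(max_iter):
        _, h = evaluate(P, c, policy)
        new_policy = improve(P, c, h, policy)
        if new_policy == policy:
            break
        policy = new_policy
    rho, h = evaluate(P, c, policy)
    return policy, rho, h


# Hypothetical two-state, two-action example: P[a][x][y], c[x][a].
P = [[[0.9, 0.1], [0.4, 0.6]],
     [[0.2, 0.8], [0.7, 0.3]]]
c = [[1.0, 0.5],
     [0.2, 1.5]]
policy, rho, h = policy_iteration(P, c)
print(policy, rho, h)
```

In this finite setting `rho` is the logarithm of the Perron root of the matrix attached to the returned policy, so it can be cross-checked by enumerating all stationary policies.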