Journal
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH
Volume 296, Issue 3, Pages 953-966
Publisher
ELSEVIER
DOI: 10.1016/j.ejor.2021.04.030
Keywords
Dynamic programming; Risk-sensitive Markov decision process; Risk measure; Robustness
This paper investigates risk-sensitive Markov Decision Processes with unbounded cost and finite/infinite planning horizons. By recursively applying static risk measures and making direct assumptions on model data, we derive a Bellman equation and prove the existence of optimal Markov policies. Additionally, our approach unifies results for various well-known risk measures and establishes a connection to distributionally robust MDPs.
In this paper, we consider risk-sensitive Markov Decision Processes (MDPs) with Borel state and action spaces and unbounded cost. We treat both finite and infinite planning horizons. Our optimality criterion is based on the recursive application of static risk measures. This is motivated by recursive utilities in the economic literature. It has been studied before for the entropic risk measure and is extended here to general static risk measures. Under direct assumptions on the model data we derive a Bellman equation and prove the existence of optimal Markov policies. For an infinite planning horizon, the model is shown to be contractive and the optimal policy to be stationary. Our approach unifies results for a number of well-known risk measures. Moreover, we establish a connection to distributionally robust MDPs, which provides a global interpretation of the recursively defined objective function. Monotone models are studied in particular. (c) 2021 Elsevier B.V. All rights reserved.
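As a rough illustration of the recursive criterion described in the abstract (not code from the paper), the Bellman recursion V(s) = min_a [ c(s,a) + beta * rho(V(S')) ] can be sketched for a finite MDP using the entropic risk measure rho_gamma(X) = (1/gamma) log E[exp(gamma X)], one of the static risk measures the paper covers. All function names and the toy model below are hypothetical:

```python
import math

def entropic_risk(values, probs, gamma=1.0):
    """Static entropic risk measure: rho(X) = (1/gamma) * log E[exp(gamma * X)]."""
    return math.log(sum(p * math.exp(gamma * v) for v, p in zip(values, probs))) / gamma

def risk_sensitive_value_iteration(cost, P, beta=0.9, gamma=1.0, horizon=100):
    """Sketch of the recursive criterion V(s) = min_a [ c(s,a) + beta * rho_gamma(V(S')) ]
    for a finite MDP. cost[s][a] is the one-stage cost, P[s][a] the transition law.
    With beta < 1 the operator is contractive, mirroring the infinite-horizon result."""
    S, A = len(cost), len(cost[0])
    V = [0.0] * S
    for _ in range(horizon):
        V = [min(cost[s][a] + beta * entropic_risk(V, P[s][a], gamma)
                 for a in range(A))
             for s in range(S)]
    # A stationary (Markov) policy read off from the converged value function.
    policy = [min(range(A),
                  key=lambda a, s=s: cost[s][a] + beta * entropic_risk(V, P[s][a], gamma))
              for s in range(S)]
    return V, policy
```

As gamma tends to 0 the entropic risk measure recovers the risk-neutral expectation, so this recursion then reduces to ordinary discounted value iteration.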