Article

Stochastic Approximation for Risk-Aware Markov Decision Processes

Journal

IEEE TRANSACTIONS ON AUTOMATIC CONTROL
Volume 66, Issue 3, Pages 1314-1320

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TAC.2020.2989702

Keywords

Markov decision processes (MDPs); risk measure; saddle point; stochastic approximation; Q-learning

Funding

  1. SRIBD International Postdoctoral Fellowship
  2. National Research Foundation, Prime Minister's Office, Singapore, under its Campus for Research Excellence and Technological Enterprise program
  3. Singapore Ministry of Education Grant [R-266-000-083-133]
  4. Singapore Ministry of Education Tier II Grant [MOE2015-T2-2-148]

A stochastic approximation algorithm is developed to solve risk-aware Markov decision processes. It covers several risk measures, and both almost sure convergence and a convergence rate are established. The overall convergence rate is $\Omega\big((\ln(1/(\delta\epsilon))/\epsilon^{2})^{1/k} + (\ln(1/\epsilon))^{1/(1-k)}\big)$ with probability at least $1-\delta$, for a given error tolerance $\epsilon > 0$ and learning rate $k \in (1/2, 1]$.
We develop a stochastic approximation-type algorithm to solve finite state/action, infinite-horizon, risk-aware Markov decision processes. Our algorithm has two loops. The inner loop computes the risk by solving a stochastic saddle-point problem. The outer loop performs Q-learning to compute an optimal risk-aware policy. Several widely investigated risk measures (e.g., conditional value-at-risk, optimized certainty equivalent, and absolute semideviation) are covered by our algorithm. Almost sure convergence and the convergence rate of the algorithm are established. For an error tolerance $\epsilon > 0$ on the optimal Q-value estimation gap and a learning rate $k \in (1/2, 1]$, the overall convergence rate of our algorithm is $\Omega\big((\ln(1/(\delta\epsilon))/\epsilon^{2})^{1/k} + (\ln(1/\epsilon))^{1/(1-k)}\big)$ with probability at least $1-\delta$.
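The two-loop structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the inner loop here handles only conditional value-at-risk, using the Rockafellar-Uryasev form CVaR_alpha(X) = min_eta { eta + E[(X - eta)^+] / (1 - alpha) }, a one-dimensional stochastic minimization swapped in for the paper's general saddle-point solver. The function names, the simulator argument `step`, the cost-minimizing convention, and the assumption that the learning rate k enters as a step size n^(-k) are all hypothetical choices made for illustration.

```python
import random


def cvar_inner_loop(sample_cost, alpha, n_steps, k, rng):
    """Estimate CVaR_alpha of a random cost by stochastic approximation.

    Assumption: stochastic subgradient descent on eta in the
    Rockafellar-Uryasev objective, with step size t**(-k).
    """
    eta = 0.0
    for t in range(1, n_steps + 1):
        x = sample_cost(rng)
        # Subgradient of eta + (x - eta)^+ / (1 - alpha) with respect to eta.
        grad = 1.0 - (1.0 / (1.0 - alpha) if x > eta else 0.0)
        eta -= t ** (-k) * grad
    # Evaluate the objective at the final eta using fresh samples.
    tail = sum(max(sample_cost(rng) - eta, 0.0) for _ in range(n_steps)) / n_steps
    return eta + tail / (1.0 - alpha)


def risk_aware_q_learning(states, actions, step, alpha, gamma,
                          n_outer, n_inner, k, seed=0):
    """Outer Q-learning loop over a simulator step(s, a, rng) -> (cost, s_next).

    Assumption: every state-action pair is updated in each outer
    iteration, with outer step size n**(-k).
    """
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in states for a in actions}
    for n in range(1, n_outer + 1):
        theta = n ** (-k)
        for s in states:
            for a in actions:
                def target(r, s=s, a=a):
                    # One sampled cost-to-go: immediate cost plus the
                    # discounted best (lowest) next-state Q-value.
                    cost, s_next = step(s, a, r)
                    return cost + gamma * min(Q[(s_next, b)] for b in actions)

                risk = cvar_inner_loop(target, alpha, n_inner, k, rng)
                Q[(s, a)] += theta * (risk - Q[(s, a)])
    return Q


def toy_step(s, a, rng):
    """Hypothetical one-state MDP with deterministic costs 1 and 2."""
    return (1.0 if a == 0 else 2.0), 0


Q = risk_aware_q_learning([0], [0, 1], toy_step, alpha=0.5, gamma=0.5,
                          n_outer=200, n_inner=200, k=0.75)
greedy_action = min([0, 1], key=lambda a: Q[(0, a)])
```

On this toy problem the costs are deterministic, so the risk-aware fixed point coincides with the ordinary one (Q(0, 0) = 2 and Q(0, 1) = 3), which gives a simple sanity check; a stochastic `step` would be needed to see the risk measure change the policy.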
