Article

Q-Learning With Uniformly Bounded Variance

Related references

Note: Only some of the references are listed.
Article Statistics & Probability

EFFECTIVE BERRY-ESSEEN AND CONCENTRATION BOUNDS FOR MARKOV CHAINS WITH A SPECTRAL GAP

Benoit Kloeckner

ANNALS OF APPLIED PROBABILITY (2019)

Article Operations Research & Management Science

Q-learning and policy iteration algorithms for stochastic shortest path problems

Huizhen Yu et al.

ANNALS OF OPERATIONS RESEARCH (2013)

Article Automation & Control Systems

A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning

David Choi et al.

DISCRETE EVENT DYNAMIC SYSTEMS-THEORY AND APPLICATIONS (2006)

Article Mathematics, Applied

The compact law of the iterated logarithm for multivariate stochastic approximation algorithms

A Mokkadem et al.

STOCHASTIC ANALYSIS AND APPLICATIONS (2005)

Article Statistics & Probability

A law of the iterated logarithm for stochastic approximation procedures in d-dimensional Euclidean space

V Koval et al.

STOCHASTIC PROCESSES AND THEIR APPLICATIONS (2003)

Article Statistics & Probability

Hoeffding's inequality for uniformly ergodic Markov chains

PW Glynn et al.

STATISTICS & PROBABILITY LETTERS (2002)

Article Automation & Control Systems

Learning algorithms for Markov decision processes with average cost

J Abounadi et al.

SIAM JOURNAL ON CONTROL AND OPTIMIZATION (2001)

Article Automation & Control Systems

The ODE method for convergence of stochastic approximation and reinforcement learning

VS Borkar et al.

SIAM JOURNAL ON CONTROL AND OPTIMIZATION (2000)