Article; Proceedings Paper

An analysis of model-based Interval Estimation for Markov Decision Processes

JOURNAL OF COMPUTER AND SYSTEM SCIENCES (2008)

Journal

JOURNAL OF COMPUTER AND SYSTEM SCIENCES

Volume 74, Issue 8, Pages 1309-1331

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE

DOI: 10.1016/j.jcss.2007.08.009

Keywords

Reinforcement learning; Learning theory; Markov Decision Processes

Abstract

Several algorithms for learning near-optimal policies in Markov Decision Processes have been analyzed and proven efficient. Empirical results have suggested that Model-based Interval Estimation (MBIE) learns efficiently in practice, effectively balancing exploration and exploitation. This paper presents a theoretical analysis of MBIE and a new variation called MBIE-EB, proving their efficiency even under worst-case conditions. The paper also introduces a new performance metric, average loss, and relates it to its less online cousins from the literature. © 2008 Elsevier Inc. All rights reserved.
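
For readers unfamiliar with the "EB" variant, the following is a minimal sketch of an exploration-bonus backup in the spirit of MBIE-EB: value iteration on the empirical model, with a bonus of beta / sqrt(n(s, a)) added to each state-action value. The function name, the tabular count layout, the default parameters, and the optimistic value assigned to unvisited pairs (which assumes rewards in [0, 1]) are illustrative assumptions, not details taken from this listing.

```python
import math


def mbie_eb_q_values(counts, reward_sums, next_counts,
                     beta=1.0, gamma=0.9, sweeps=500):
    """Approximate optimistic Q-values from empirical counts (illustrative).

    counts[s][a]         -- number of times action a was taken in state s
    reward_sums[s][a]    -- total reward observed for (s, a)
    next_counts[s][a][t] -- number of observed transitions (s, a) -> t
    """
    n_states = len(counts)
    n_actions = len(counts[0])
    # Assumption: rewards lie in [0, 1], so 1 / (1 - gamma) bounds any return.
    optimistic = 1.0 / (1.0 - gamma)
    q = [[0.0] * n_actions for _ in range(n_states)]

    for _ in range(sweeps):
        v = [max(row) for row in q]  # greedy state values from the last sweep
        new_q = [[0.0] * n_actions for _ in range(n_states)]
        for s in range(n_states):
            for a in range(n_actions):
                n = counts[s][a]
                if n == 0:
                    # Assumption: never-tried pairs get the optimistic bound.
                    new_q[s][a] = optimistic
                    continue
                r_hat = reward_sums[s][a] / n
                expected_next = sum(
                    next_counts[s][a][t] / n * v[t] for t in range(n_states)
                )
                bonus = beta / math.sqrt(n)  # shrinks as (s, a) is tried more
                new_q[s][a] = r_hat + gamma * expected_next + bonus
        q = new_q
    return q
```

Acting greedily with respect to the returned values favors rarely tried actions until their bonus decays, which is one way to read the abstract's "balancing exploration and exploitation".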
