4.2 Article Proceedings Paper

An analysis of model-based Interval Estimation for Markov Decision Processes

Journal

JOURNAL OF COMPUTER AND SYSTEM SCIENCES
Volume 74, Issue 8, Pages 1309-1331

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.jcss.2007.08.009

Keywords

Reinforcement learning; Learning theory; Markov Decision Processes

Several algorithms for learning near-optimal policies in Markov Decision Processes have been analyzed and proven efficient. Empirical results have suggested that Model-based Interval Estimation (MBIE) learns efficiently in practice, effectively balancing exploration and exploitation. This paper presents a theoretical analysis of MBIE and a new variation called MBIE-EB, proving their efficiency even under worst-case conditions. The paper also introduces a new performance metric, average loss, and relates it to its less online cousins from the literature. (C) 2008 Elsevier Inc. All rights reserved.
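
The abstract names MBIE-EB but does not state its update rule. As a rough illustration only, the sketch below assumes the exploration-bonus form usually associated with MBIE-EB: value iteration on the empirical model, with a bonus of beta / sqrt(n(s, a)) added to the estimated reward. The function name `mbie_eb_q`, its parameters, the default constants, and the optimistic value given to unvisited state-action pairs are all assumptions made for this example and are not taken from the paper.

```python
import math


def mbie_eb_q(n, r_hat, t_hat, gamma=0.9, beta=0.5, r_max=1.0, sweeps=500):
    """Hypothetical sketch: value iteration with a beta/sqrt(n) reward bonus.

    n[s][a]      -- visit count for state s, action a
    r_hat[s][a]  -- empirical mean reward
    t_hat[s][a]  -- empirical next-state distribution (list over states)
    """
    num_states, num_actions = len(n), len(n[0])
    # Assumption: unvisited pairs receive the largest possible return.
    v_max = r_max / (1.0 - gamma)
    q = [[0.0] * num_actions for _ in range(num_states)]
    for _ in range(sweeps):
        # Synchronous sweep: state values come from the previous iterate.
        v = [max(row) for row in q]
        for s in range(num_states):
            for a in range(num_actions):
                if n[s][a] == 0:
                    q[s][a] = v_max
                    continue
                bonus = beta / math.sqrt(n[s][a])
                expected = sum(p * v[s2] for s2, p in enumerate(t_hat[s][a]))
                q[s][a] = r_hat[s][a] + bonus + gamma * expected
    return q


# One state, two actions: action 0 is well explored, action 1 was tried once.
q = mbie_eb_q(n=[[100, 1]], r_hat=[[0.5, 0.4]], t_hat=[[[1.0], [1.0]]])
greedy_action = max(range(2), key=lambda a: q[0][a])
```

In this toy case the rarely tried action has the lower empirical reward but the larger bonus, so the greedy choice favors it. That is the exploration-exploitation balance the abstract describes. The bonus shrinks as the visit count grows.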
