Article; Proceedings Paper

An analysis of model-based Interval Estimation for Markov Decision Processes

JOURNAL OF COMPUTER AND SYSTEM SCIENCES (2008)

Journal

JOURNAL OF COMPUTER AND SYSTEM SCIENCES

Volume 74, Issue 8, Pages 1309-1331

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE

DOI: 10.1016/j.jcss.2007.08.009

Keywords

Reinforcement learning; Learning theory; Markov Decision Processes

Categories

Computer Science, Hardware & Architecture; Computer Science, Theory & Methods

Abstract

Several algorithms for learning near-optimal policies in Markov Decision Processes have been analyzed and proven efficient. Empirical results have suggested that Model-based Interval Estimation (MBIE) learns efficiently in practice, effectively balancing exploration and exploitation. This paper presents a theoretical analysis of MBIE and a new variation called MBIE-EB, proving their efficiency even under worst-case conditions. The paper also introduces a new performance metric, average loss, and relates it to its less online cousins from the literature. (C) 2008 Elsevier Inc. All rights reserved.

