☆ 3.8 Proceedings Paper

EnsembleDAgger: A Bayesian Approach to Safe Imitation Learning

2019 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) (2019)

期刊

2019 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS)

卷 -, 期 -, 页码 5041-5048

出版社

IEEE

DOI: 10.1109/iros40897.2019.8968287

关键词

类别

Computer Science, Artificial Intelligence Computer Science, Information Systems Computer Science, Theory & Methods Robotics

资金

SAIC Innovation Center, a subsidiary of SAIC Motors
AFRL
DARPA [FA8750-18-C-0099]

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

摘要

Although imitation learning is often used in robotics, the approach frequently suffers from data mismatch and compounding errors. DAgger is an iterative algorithm that addresses these issues by aggregating training data from both the expert and novice policies, but does not consider the impact of safety. We present a probabilistic extension to DAgger, which attempts to quantify the confidence of the novice policy as a proxy for safety. Our method, EnsembleDAgger, approximates a Gaussian Process using an ensemble of neural networks. Using the variance as a measure of confidence, we compute a decision rule that captures how much we doubt the novice, thus determining when it is safe to allow the novice to act. With this approach, we aim to maximize the novice's share of actions, while constraining the probability of failure. We demonstrate improved safety and learning performance compared to other DAgger variants and classic imitation learning on an inverted pendulum and in the MuJoCo HalfCheetah environment.

EnsembleDAgger: A Bayesian Approach to Safe Imitation Learning

期刊

2019 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS)

出版社

IEEE

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

EnsembleDAgger: A Bayesian Approach to Safe Imitation Learning

期刊

2019 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS)

出版社

IEEE

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文