4.6 Article

Safety Augmented Value Estimation From Demonstrations (SAVED): Safe Deep Model-Based RL for Sparse Cost Robotic Tasks

Journal

IEEE ROBOTICS AND AUTOMATION LETTERS
Volume 5, Issue 2, Pages 3612-3619

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/LRA.2020.2976272

Keywords

Task analysis; Heuristic algorithms; Cost function; Planning; Uncertainty; Robots; Trajectory; Reinforcement learning; imitation learning; optimal control

Funding

  1. Scalable Collaborative Human-Robot Learning (SCHooL) Project, an NSF National Robotics Initiative [1734633]
  2. NSF GRFP
  3. Office of Naval Research [N00014-311]
  4. Direct For Computer & Info Scie & Enginr [1734633] Funding Source: National Science Foundation
  5. Div Of Information & Intelligent Systems [1734633] Funding Source: National Science Foundation

Reinforcement learning (RL) for robotics is challenging due to the difficulty in hand-engineering a dense cost function, which can lead to unintended behavior, and dynamical uncertainty, which makes exploration and constraint satisfaction challenging. We address these issues with a new model-based reinforcement learning algorithm, Safety Augmented Value Estimation from Demonstrations (SAVED), which uses supervision that only identifies task completion and a modest set of suboptimal demonstrations to constrain exploration and learn efficiently while handling complex constraints. We then compare SAVED with 3 state-of-the-art model-based and model-free RL algorithms on 6 standard simulation benchmarks involving navigation and manipulation and a physical knot-tying task on the da Vinci surgical robot. Results suggest that SAVED outperforms prior methods in terms of success rate, constraint satisfaction, and sample efficiency, making it feasible to safely learn a control policy directly on a real robot in less than an hour. For tasks on the robot, baselines succeed less than 5% of the time while SAVED has a success rate of over 75% in the first 50 training iterations. Code and supplementary material are available at https://tinyurl.com/saved-rl.
