Proceedings Paper

Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks

Publisher

IEEE
DOI: 10.1109/icra.2019.8793485

Keywords

-

Funding

  1. JD.com American Technologies Corporation (JD) under the SAIL-JD AI Research Initiative
  2. Toyota Research Institute (TRI)

Abstract

Contact-rich manipulation tasks in unstructured environments often require both haptic and visual feedback. However, it is non-trivial to manually design a robot controller that combines modalities with very different characteristics. While deep reinforcement learning has shown success in learning control policies for high-dimensional inputs, these algorithms are generally intractable to deploy on real robots due to sample complexity. We use self-supervision to learn a compact and multimodal representation of our sensory inputs, which can then be used to improve the sample efficiency of our policy learning. We evaluate our method on a peg insertion task, generalizing over different geometry, configurations, and clearances, while being robust to external perturbations. We present results in simulation and on a real robot.
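
The recipe the abstract describes (separate encoders per modality, a compact fused latent, and a self-supervised training signal that requires no manual labels) can be illustrated with a short sketch. The following PyTorch example is a minimal, hypothetical illustration, not the authors' architecture: the MultimodalEncoder name, layer sizes, and the single time-alignment objective are all assumptions (the paper combines several self-supervised objectives, including optical-flow and contact prediction).

```python
# Hypothetical sketch: encode vision and touch separately, fuse them into a
# compact latent vector, and train with a self-supervised "are these two
# modalities time-aligned?" objective. Sizes and names are illustrative
# assumptions, not the architecture from the paper.
import torch
import torch.nn as nn

class MultimodalEncoder(nn.Module):
    def __init__(self, latent_dim: int = 128):
        super().__init__()
        # Vision branch: small CNN over 3x64x64 RGB frames (assumed input size).
        self.vision = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 8 * 8, latent_dim),
        )
        # Haptic branch: MLP over a flattened window of 32 six-axis F/T readings.
        self.touch = nn.Sequential(
            nn.Linear(32 * 6, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        # Fusion: concatenate both branches, project to the compact latent.
        self.fuse = nn.Sequential(
            nn.Linear(2 * latent_dim, latent_dim), nn.ReLU(),
            nn.Linear(latent_dim, latent_dim),
        )
        # Self-supervised head: binary "modalities time-aligned?" classifier.
        self.aligned = nn.Linear(latent_dim, 1)

    def forward(self, rgb, ft):
        z = self.fuse(torch.cat([self.vision(rgb), self.touch(ft)], dim=-1))
        return z, self.aligned(z)

# Training-step sketch: positives are matching (vision, touch) pairs; negatives
# pair a frame with force readings from a different time step. The labels come
# for free from the data itself, which is what makes this self-supervised.
enc = MultimodalEncoder()
opt = torch.optim.Adam(enc.parameters(), lr=1e-4)
rgb = torch.randn(8, 3, 64, 64)              # stand-in batch of camera frames
ft = torch.randn(8, 32 * 6)                  # stand-in force/torque windows
ft_neg = torch.roll(ft, shifts=1, dims=0)    # shifted pairing = misaligned

_, pos_logit = enc(rgb, ft)
_, neg_logit = enc(rgb, ft_neg)
logits = torch.cat([pos_logit, neg_logit])
labels = torch.cat([torch.ones(8, 1), torch.zeros(8, 1)])
loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
opt.zero_grad(); loss.backward(); opt.step()
# After training, the frozen latent z would serve as the compact input to a
# reinforcement-learning policy, improving its sample efficiency.
```

The time-alignment objective is only one of the self-supervised signals the paper uses; the point of the sketch is the overall structure, in which representation learning is decoupled from policy learning so that the expensive RL loop operates on a low-dimensional latent rather than raw pixels and force readings.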


Reviews

Primary rating: 3.8 (insufficient ratings)
Secondary ratings (novelty, significance, scientific rigor): not yet rated