Learning Bipedal Walking for Humanoids With Current Feedback

IEEE ACCESS (2023)

Journal

IEEE ACCESS

Volume 11, Pages 82013-82023

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/ACCESS.2023.3301175

Keywords

Robots; Legged locomotion; Humanoid robots; Torque; Foot; Actuators; Training; Deep learning; Reinforcement learning; Bipedal locomotion; sim2real

Categories

Computer Science, Information Systems; Engineering, Electrical & Electronic; Telecommunications

Summary

Deep reinforcement learning combined with training in simulation has opened a new route to robust controllers for legged robots, but a large gap between simulation and reality has limited its use on life-sized humanoid robots. The authors address the part of this sim2real gap that arises from inaccurate torque tracking at the actuator level: the policy is trained in a simulation artificially degraded with poor torque tracking and, on the real robot, uses current feedback from the actuators. The resulting policy achieves bipedal locomotion on a real HRP-5P humanoid robot.

Abstract

Recent advances in deep reinforcement learning (RL) based techniques combined with training in simulation have offered a new approach to developing robust controllers for legged robots. However, the application of such approaches to real hardware has largely been limited to quadrupedal robots with direct-drive actuators and lightweight bipedal robots with low gear-ratio transmission systems. Application to real, life-sized humanoid robots has been less common, arguably due to a large sim2real gap. In this paper, we present an approach for effectively overcoming the sim2real gap issue for humanoid robots arising from inaccurate torque-tracking at the actuator level. Our key idea is to utilize the current feedback from the actuators on the real robot, after training the policy in a simulation environment artificially degraded with poor torque-tracking. Our approach successfully trains a unified, end-to-end policy in simulation that can be deployed on a real HRP-5P humanoid robot to achieve bipedal locomotion. Through ablations, we also show that a feedforward policy architecture combined with targeted dynamics randomization is sufficient for zero-shot sim2real success, thus eliminating the need for computationally expensive, memory-based network architectures. Finally, we validate the robustness of the proposed RL policy by comparing its performance against a conventional model-based controller for walking on uneven terrain with the real robot.
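
The key idea in the abstract, training in a simulation degraded with poor torque-tracking and feeding actuator current back to the policy, can be illustrated with a toy model. The sketch below is not the authors' implementation: the actuator model (a scaled torque command minus viscous damping), the randomization ranges, the torque constant, the gear ratio, and every name in it are hypothetical choices made only for illustration.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DegradedActuator:
    """Toy joint actuator whose applied torque deviates from the command.

    All parameters are hypothetical; the abstract does not specify a model.
    """
    torque_scale: float            # fraction of the commanded torque actually applied
    damping: float                 # viscous loss coefficient [N*m*s/rad]
    torque_constant: float = 0.1   # motor torque constant [N*m/A] (assumed)
    gear_ratio: float = 100.0      # transmission ratio (assumed)

    def apply(self, commanded_torque: float, joint_velocity: float) -> Tuple[float, float]:
        """Return (applied torque, current-like feedback signal)."""
        applied = self.torque_scale * commanded_torque - self.damping * joint_velocity
        # Simplification: treat the feedback as proportional to the applied torque.
        current = applied / (self.torque_constant * self.gear_ratio)
        return applied, current


def sample_actuator(rng: random.Random) -> DegradedActuator:
    """Draw new tracking-error parameters, e.g. once per training episode."""
    return DegradedActuator(
        torque_scale=rng.uniform(0.5, 1.0),  # assumed range
        damping=rng.uniform(0.0, 2.0),       # assumed range
    )


def build_observation(joint_pos: float, joint_vel: float, measured_current: float) -> List[float]:
    """Observation for a feedforward policy, including the current feedback."""
    return [joint_pos, joint_vel, measured_current]


if __name__ == "__main__":
    rng = random.Random(0)
    actuator = sample_actuator(rng)
    applied, current = actuator.apply(commanded_torque=10.0, joint_velocity=0.5)
    print(build_observation(0.1, 0.5, current))
```

Because the feedback signal enters the observation, a policy trained under such randomized tracking errors can in principle see how much of its commanded torque was realized at each step, which is one plausible reading of why a memoryless feedforward network could suffice.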
