Article

Hybrid autonomous controller for bipedal robot balance with deep reinforcement learning and pattern generators

Journal

ROBOTICS AND AUTONOMOUS SYSTEMS
Volume 146

Publisher

ELSEVIER
DOI: 10.1016/j.robot.2021.103891

Keywords

Bipedal robot; Pattern generator; Reinforcement learning; Hybrid controller

Funding

  1. Engineering and Physical Sciences Research Council (EPSRC) Center for Doctoral Training in Embedded Intelligence (CDT-EI) [EP/L014998/1]


This research proposes a hybrid autonomous controller that hierarchically combines two separate systems, enabling bipedal robots to recover from pushes in real-world applications where they must collaborate closely with humans. By combining hardcoded and reinforcement-learning controllers, it balances speed and adaptability, maintaining efficient control in new dynamic environments.
Recovering after an abrupt push is essential for bipedal robots in real-world applications within environments where humans must collaborate closely with robots. Several balancing algorithms for bipedal robots exist in the literature; however, most rely either on hard-coded behavior or on power-hungry algorithms. To address this problem, we propose a hybrid autonomous controller that hierarchically combines two separate, efficient systems. The lower-level system is a reliable, high-speed, full-state controller hardcoded on a microcontroller for power efficiency. The higher-level system is a low-speed reinforcement-learning controller implemented on a low-power onboard computer. While one controller offers speed, the other provides trainability and adaptability, so efficient control is achieved without sacrificing adaptability to new dynamic environments. Additionally, because the higher-level system is trained via deep reinforcement learning, the robot can continue to learn after deployment, which is ideal for real-world applications. The system's performance is validated on a real robot that recovers from a random push in less than 5 s, taking minimal steps from its initial position. Training was conducted using simulated data. (C) 2021 The Author(s). Published by Elsevier B.V.
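The hierarchical arrangement described in the abstract, where a slow learned policy periodically retargets a fast hardcoded feedback loop, can be sketched as follows. This is a minimal illustration only, assuming hypothetical class and parameter names (`LowLevelController`, `HighLevelPolicy`, a 1-D pitch state, and a placeholder policy); none of these come from the paper itself.

```python
# Minimal sketch of a two-rate hybrid controller. All names and gains here
# are illustrative assumptions, not the paper's actual implementation.
from dataclasses import dataclass


@dataclass
class State:
    pitch: float       # torso pitch angle (rad)
    pitch_rate: float  # torso pitch angular velocity (rad/s)


class LowLevelController:
    """High-speed full-state feedback loop (the microcontroller's role)."""

    def __init__(self, k_angle: float, k_rate: float):
        self.k_angle = k_angle
        self.k_rate = k_rate
        self.setpoint = 0.0  # target pitch chosen by the high-level policy

    def torque(self, s: State) -> float:
        # Plain state feedback driving the pitch toward the current setpoint.
        return -self.k_angle * (s.pitch - self.setpoint) - self.k_rate * s.pitch_rate


class HighLevelPolicy:
    """Low-speed learned policy (stand-in for the deep-RL agent)."""

    def act(self, s: State) -> float:
        # A real policy would be a trained network; as a placeholder we
        # lean the setpoint slightly against the measured pitch rate.
        return -0.1 * s.pitch_rate


def control_step(tick: int, s: State, low: LowLevelController,
                 high: HighLevelPolicy, ratio: int = 10) -> float:
    # The slow policy updates the setpoint once every `ratio` fast ticks;
    # the fast loop computes a torque on every tick.
    if tick % ratio == 0:
        low.setpoint = high.act(s)
    return low.torque(s)
```

The design choice this sketches is the one the abstract argues for: the fast inner loop needs no learning and can run cheaply at a high rate, while the trainable outer loop only has to act at a low rate, keeping its compute budget small.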

