Article

Deep-reinforcement-learning-based gait pattern controller on an uneven terrain for humanoid robots

Publisher

TAYLOR & FRANCIS INC
DOI: 10.1080/15599612.2023.2222146

Keywords

Gait pattern generator; humanoid robots; deep reinforcement learning; PPO2

Abstract

Conventional gait pattern control for humanoid robots is typically developed on flat terrain, yet the roads people walk on every day have bumps and potholes. To make humanoid robots move more like humans, their movement parameters must therefore be adapted to uneven terrain. In this study, reinforcement learning (RL) was used to let humanoid robots train themselves and automatically adjust these parameters for optimal gait pattern control. Because RL comprises many algorithms, each with its own strengths and shortcomings, a series of experiments was performed; the results indicated that proximal policy optimization (PPO), which combines advantage actor-critic (A2C) with trust region policy optimization (TRPO), was the most suitable method. An improved version of PPO, called PPO2, was therefore adopted, and the experimental results indicated that combining deep RL with data preprocessing methods such as wavelet transform and fuzzification improved the gait pattern control and balance of humanoid robots.
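The abstract does not reproduce the authors' implementation, but the core idea behind PPO2-style training is the clipped surrogate objective from Schulman et al. (2017). The following is a minimal sketch, assuming PyTorch; the function and tensor names are illustrative and not taken from the paper.

```python
# Minimal sketch of the PPO clipped surrogate loss ("PPO2"-style),
# assuming PyTorch. Names are illustrative, not from the paper.
import torch

def ppo_clip_loss(log_probs_new: torch.Tensor,
                  log_probs_old: torch.Tensor,
                  advantages: torch.Tensor,
                  clip_eps: float = 0.2) -> torch.Tensor:
    """Clipped surrogate objective.

    The probability ratio r = pi_new(a|s) / pi_old(a|s) is clipped to
    [1 - eps, 1 + eps], so a single gradient step cannot move the policy
    far from the one that collected the data -- a cheap surrogate for
    TRPO's explicit trust-region constraint.
    """
    ratio = torch.exp(log_probs_new - log_probs_old)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Maximize the surrogate, i.e., minimize its negative mean.
    return -torch.min(unclipped, clipped).mean()
```

The pessimistic `torch.min` of the clipped and unclipped terms is what keeps updates conservative: the objective never rewards pushing the ratio beyond the clip range.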
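The two preprocessing steps the abstract names, wavelet transform and fuzzification, could plausibly look like the sketch below. This assumes PyWavelets and NumPy; the `db4` wavelet, the threshold, and the membership breakpoints are assumptions for illustration, not values from the paper.

```python
# Hedged sketch of the two preprocessing steps named in the abstract:
# wavelet denoising of a sensor signal and fuzzification of a state
# variable. The wavelet choice, threshold, and membership breakpoints
# are illustrative assumptions, not values from the paper.
import numpy as np
import pywt

def wavelet_denoise(signal: np.ndarray, wavelet: str = "db4",
                    level: int = 3, thresh: float = 0.1) -> np.ndarray:
    """Soft-threshold the detail coefficients, then reconstruct."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    coeffs[1:] = [pywt.threshold(c, thresh, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[: len(signal)]

def fuzzify_tilt(angle_deg: float) -> np.ndarray:
    """Map a torso tilt angle to triangular membership degrees
    (negative, zero, positive); breakpoints are assumptions."""
    def tri(x: float, a: float, b: float, c: float) -> float:
        return max(min((x - a) / (b - a + 1e-9),
                       (c - x) / (c - b + 1e-9)), 0.0)
    return np.array([tri(angle_deg, -20, -10, 0),
                     tri(angle_deg, -10, 0, 10),
                     tri(angle_deg, 0, 10, 20)])
```

In a setup like this, the denoised signal and the low-dimensional membership vector would replace raw sensor readings in the RL state, which is one common way such preprocessing can stabilize policy training.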
