☆ 4.7 Article

Reinforcement learning for quadrupedal locomotion with design of continual-hierarchical curriculum

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2020)

期刊

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE

卷 95, 期 -, 页码 -

出版社

PERGAMON-ELSEVIER SCIENCE LTD

DOI: 10.1016/j.engappai.2020.103869

关键词

Continual learning; Curriculum learning; Hierarchical learning; Reservoir computing; Fractal network

类别

Automation & Control Systems Computer Science, Artificial Intelligence Engineering, Multidisciplinary Engineering, Electrical & Electronic

资金

JSPS KAKENHI, Japan [17K12759]
Grants-in-Aid for Scientific Research [17K12759] Funding Source: KAKEN

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

摘要

End-to-end reinforcement learning is a promising approach to enable robots to acquire complicated skills. However, this requires numerous samples to be implemented successfully. The issue is that it is often difficult to collect the sufficient number of samples. To accelerate learning in the field of robotics, knowledge gathered from robotics engineering and previously learned tasks must be fully exploited. Specifically, we propose using a sample-efficient curriculum to establish quadrupedal robot control in which the walking and turning tasks are divided into two hierarchical layers, and a robot learns them incrementally from lower to upper layers. To develop such a curriculum, two core components are designed. First the fractal design of neural networks in reservoir computing is aimed at allocating the tasks to be learned to respective modules in fractal networks. This allows mitigating the problem of catastrophic forgetting in neural networks and achieves the capability of continuous learning. The second task includes hierarchical task decomposition according to robotics knowledge for controlling legged robots. Owing to the combination of these two components, the proposed curriculum enables a robot to tune the lower layer even when the upper layer is optimized. As a result of implementing the proposed design, we confirm that a quadrupedal robot in a dynamical simulator succeeds in learning skills hierarchically according to the given curriculum, starting from moving legs and finally, walking/turning, unlike the considered conventional curriculums that are unable to achieve such results.

Reinforcement learning for quadrupedal locomotion with design of continual-hierarchical curriculum

期刊

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE

出版社

PERGAMON-ELSEVIER SCIENCE LTD

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

Reinforcement learning for quadrupedal locomotion with design of continual-hierarchical curriculum

期刊

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE

出版社

PERGAMON-ELSEVIER SCIENCE LTD

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文