Article

Interest-Driven Exploration With Observational Learning for Developmental Robots

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TCDS.2021.3057758

Keywords

Robots; Robot kinematics; Task analysis; Probabilistic logic; Training; Manipulators; Knowledge based systems; Developmental robot; direct inverse kinematics learning from observation; intrinsic motivation; learning; online learning; robot model learning; socially guided exploration

This article proposes a novel extrinsic-intrinsic motivation learning scheme to accelerate learning by combining intrinsic motivation with learning from observation. The scheme includes four elements: 1) a probabilistic intrinsic motivation signal to spark the robot's interest; 2) a probabilistic extrinsic motivation signal to expand the robot's knowledge through learning from observation; 3) novelty detection; and 4) novelty degree methods for the robot to autonomously decide how and when to explore.
It has long been emphasized that real-world applications of developmental robots require lifelong, online learning. A major challenge in this field is the high sample complexity of algorithms, which has led to the development of intrinsic motivation approaches that make learning more efficient. However, only a few works have been demonstrated on real robots, and although these robots are supposed to share the environment with humans, there is hardly any research on integrating intrinsic motivation with learning from an interacting teacher. In this article, we tackle the efficiency challenge by proposing a novel extrinsic-intrinsic motivation learning scheme. We specifically investigate how to combine intrinsic motivation with learning from observation to accelerate learning. Our novel scheme comprises four elements: 1) a probabilistic intrinsic motivation signal yielding the robot's interest; 2) a probabilistic extrinsic motivation signal to expand the robot's knowledge by learning from observation; 3) novelty detection; and 4) novelty degree methods to enable the robot to decide autonomously how and when to explore. The efficiency and applicability of our methods are benchmarked in simulation experiments and demonstrated on the physical 7-degree-of-freedom left arm of a Baxter robot.
