4.3 Article

Goal-driven active learning

Journal

AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS

Publisher

SPRINGER
DOI: 10.1007/s10458-021-09527-5

Keywords

Deep reinforcement learning; Imitation learning; Goal-conditioned learning; Active learning

This paper proposes a goal-conditioned method that leverages very small sets of goal-driven demonstrations to greatly accelerate learning. By introducing active goal-driven demonstrations, together with a strategy that prioritizes sampling of goals where the disagreement between the expert and the policy is largest, the method outperforms prior imitation learning approaches in exploration efficiency and average score on most tasks, as demonstrated on several benchmark environments from the MuJoCo domain.
Deep reinforcement learning methods have achieved significant successes in complex decision-making problems. However, they traditionally rely on well-designed extrinsic rewards, which limits their applicability to many real-world tasks where rewards are naturally sparse. While cloning behaviors provided by an expert is a promising approach to the exploration problem, learning from a fixed set of demonstrations may be impractical due to lack of state coverage or distribution mismatch, i.e., when the learner's goal deviates from the demonstrated behaviors. Moreover, we are interested in learning how to reach a wide range of goals from the same set of demonstrations. In this work, we propose a novel goal-conditioned method that leverages very small sets of goal-driven demonstrations to massively accelerate the learning process. Crucially, we introduce the concept of active goal-driven demonstrations to query the demonstrator only in hard-to-learn and uncertain regions of the state space. We further present a strategy for prioritizing the sampling of goals where the disagreement between the expert and the policy is maximized. We evaluate our method on a variety of benchmark environments from the MuJoCo domain. Experimental results show that our method outperforms prior imitation learning approaches on most of the tasks in terms of exploration efficiency and average scores.
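
The abstract only summarizes the two mechanisms, so the Python sketch below illustrates one plausible reading of them. It is a hedged illustration under assumed interfaces, not the authors' implementation: policy(state, goal) is a goal-conditioned policy returning an action, demos maps each goal to its recorded expert (state, action) pairs, and expert.demonstrate(goal) is a hypothetical query that returns new pairs for that goal.

import numpy as np

def goal_disagreement(policy, demos, goal):
    """Mean action distance between the current policy and the expert on the
    demonstration (state, action) pairs stored for this goal."""
    errors = [np.linalg.norm(policy(state, goal) - expert_action)
              for state, expert_action in demos[goal]]
    return float(np.mean(errors)) if errors else 0.0

def sample_goal(policy, demos, goals, temperature=1.0, rng=np.random):
    """Prioritized goal sampling: P(goal) proportional to exp(disagreement / temperature)."""
    scores = np.array([goal_disagreement(policy, demos, g) for g in goals])
    logits = scores / temperature
    probs = np.exp(logits - logits.max())   # numerically stable softmax
    probs /= probs.sum()
    idx = rng.choice(len(goals), p=probs)
    return goals[idx], scores[idx]

def maybe_request_demo(expert, demos, goal, score, threshold=0.5):
    """Active querying: ask the demonstrator for new (state, action) pairs only
    when the policy still disagrees strongly with the expert on this goal."""
    if score > threshold:
        demos[goal].extend(expert.demonstrate(goal))   # assumed expert interface

In a training loop one would alternate prioritized goal sampling, goal-conditioned rollouts with imitation and reinforcement updates, and occasional expert queries for goals whose disagreement stays above the threshold.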

