Article

Hierarchical automatic curriculum learning: Converting a sparse reward navigation task into dense reward

Journal

NEUROCOMPUTING
Volume 360, Pages 265-278

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2019.06.024

Keywords

Hierarchical reinforcement learning; Automatic curriculum learning; Sparse reward reinforcement learning; Sample-efficient reinforcement learning

Funding

  1. NSFC [61876095, 61751308]
  2. Beijing Natural Science Foundation [L172037]


Mastering sparse-reward or long-horizon tasks is critical but challenging in reinforcement learning. To tackle this problem, we propose a hierarchical automatic curriculum learning framework (HACL), which intrinsically motivates the agent to explore environments hierarchically and progressively. The agent is equipped with a target area during training. As the target area progressively grows, the agent learns to explore from near to far, in a curriculum fashion. The pseudo target-achieving reward converts the sparse reward into a dense reward, thus alleviating the long-horizon difficulty. The whole system makes hierarchical decisions, in which a high-level conductor travels through different targets, and a low-level executor operates in the original action space to complete the instructions given by the high-level conductor. Unlike many existing works that manually set curriculum training phases, in HACL the entire curriculum training process is automated and adapts to the agent's current exploration capability. Extensive experiments on three sparse-reward tasks, a long-horizon stochastic chain, a grid maze, and the challenging Atari game Montezuma's Revenge, show that HACL achieves comparable or even better performance with significantly fewer training frames. (C) 2019 Elsevier B.V. All rights reserved.
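The abstract's core idea, a target area that grows as the agent succeeds, turning a sparse terminal reward into dense pseudo target-achieving rewards, can be illustrated with a minimal sketch. This is not the paper's implementation: the chain environment, the deterministic stand-in for the low-level executor, the reward magnitudes, and all function names are assumptions for illustration only.

```python
# Hedged sketch of a growing-target-area curriculum on a 1-D chain.
# The true task reward is sparse: +1 only at the final state. During
# training, the agent is instead rewarded for reaching the current
# pseudo target, which expands toward the goal as stages succeed.

def pseudo_reward(state, target, goal):
    """Dense pseudo target-achieving reward; the sparse task reward
    is only given at the true goal state."""
    if state == goal:
        return 1.0                              # original sparse reward
    return 0.1 if state == target else 0.0      # dense pseudo reward (assumed magnitude)

def run_curriculum(chain_len=10):
    """Grow the target from near the start toward the goal, one stage
    per successful target (an automatic curriculum in miniature)."""
    goal = chain_len - 1
    target = 1
    history = []
    while target <= goal:
        # Stand-in "executor": deterministically walks right, so every
        # stage succeeds. A real low-level policy would be learned.
        state, total = 0, 0.0
        while state < target:
            state += 1
            total += pseudo_reward(state, target, goal)
        history.append((target, round(total, 2)))
        target += 1   # stage succeeded -> expand the target area
    return history

print(run_curriculum(chain_len=5))
# Early stages earn only dense pseudo rewards; the final stage
# reaches the goal and collects the sparse task reward.
```

In this toy version the curriculum advances one state per stage; in HACL the expansion is driven automatically by the agent's current exploration capability rather than a fixed schedule.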

