4.7 Article

Generating attentive goals for prioritized hindsight reinforcement learning

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 203

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2020.106140

Keywords

Attentive goals generation; Prioritized hindsight model; Hindsight experience replay; Reinforcement learning

Funding

  1. National Natural Science Foundation of China [61671175]
  2. Sichuan Science and Technology Program, China [2019YFS0069]
  3. Lab of Space Optoelectronic Measurement & Perception, China [LabSOMP-2018-01]

Typical reinforcement learning (RL) performs a single task and does not scale to problems in which an agent must perform multiple tasks, such as moving a robot arm to different locations. The multi-goal framework extends typical RL with a goal-conditional value function and policy, so that the agent pursues different goals in different episodes. By treating a virtual goal as the desired one, and thereby rewarding the agent frequently, hindsight experience replay has achieved promising results in the sparse-reward setting of multi-goal RL. However, these virtual goals are sampled uniformly from the experience after the replayed state, regardless of their significance. We propose a novel prioritized hindsight model for multi-goal RL in which the agent is provided with more valuable goals, as measured by the expected temporal-difference (TD) error. An attentive goals generation (AGG) network, which consists of temporal convolutions, multi-head dot-product attention, and a last-attention network, is structured to generate the virtual goals to replay. The AGG network is trained by following the gradient of the TD error calculated by an actor-critic model, and generates goals that maximize the expected TD error with replay transitions. The whole network is fully differentiable and can be learned in an end-to-end manner. The proposed method is evaluated on several robotic manipulation tasks and demonstrates improved sample efficiency and performance. © 2020 Elsevier B.V. All rights reserved.
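
The contrast the abstract draws between uniform and prioritized virtual goals can be illustrated with a minimal sketch. It is not the authors' AGG network: instead of generating goals with a learned, differentiable model, it simply picks, among the goals achieved later in the same episode, the one with the largest absolute TD error. All names here (`sparse_reward`, `uniform_future_goal`, `prioritized_future_goal`, `toy_td_error`), the tolerance, the discount factor, and the constant-zero critic are assumptions made for illustration only.

```python
import random
from typing import Callable, Sequence, Tuple

Goal = Tuple[float, ...]


def sparse_reward(achieved: Goal, goal: Goal, tol: float = 0.05) -> float:
    """Sparse reward: 0 if the achieved goal is within `tol` of `goal`, else -1."""
    dist = sum((a - g) ** 2 for a, g in zip(achieved, goal)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def uniform_future_goal(achieved: Sequence[Goal], t: int, rng: random.Random) -> Goal:
    """Hindsight relabeling with uniform sampling.

    `achieved[k]` is the goal reached after step k; the virtual goal is drawn
    uniformly from the goals reached at steps t, t+1, ..., T-1.
    """
    return achieved[rng.randrange(t, len(achieved))]


def prioritized_future_goal(
    achieved: Sequence[Goal],
    t: int,
    td_error: Callable[[int, Goal], float],
) -> Goal:
    """Pick the candidate goal with the largest |TD error| for transition t.

    This greedy selection is a stand-in (an assumption) for a learned goal
    generator; ties are resolved in favor of the earliest candidate.
    """
    return max(achieved[t:], key=lambda g: abs(td_error(t, g)))


# Toy 1-D episode: the goal achieved after each of three steps.
achieved = [(0.0,), (1.0,), (2.0,)]


def toy_td_error(t: int, goal: Goal, gamma: float = 0.98) -> float:
    """TD error r + gamma * Q(s', g) - Q(s, g) with a hypothetical critic Q = 0."""
    q_now, q_next = 0.0, 0.0  # placeholder critic values, not a trained model
    reward = sparse_reward(achieved[t], goal)
    return reward + gamma * q_next - q_now


rng = random.Random(0)
uniform_goal = uniform_future_goal(achieved, 0, rng)
prioritized_goal = prioritized_future_goal(achieved, 0, toy_td_error)
```

With the zero critic, the TD error for the first transition reduces to the reward, so the goal that transition already achieved has zero error and the prioritized choice moves to a goal it has not yet reached. A trained critic would give a less trivial ranking.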
