Article

Planning-Augmented Hierarchical Reinforcement Learning

Journal

IEEE ROBOTICS AND AUTOMATION LETTERS
Volume 6, Issue 3, Pages 5097-5104

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/LRA.2021.3071062

Keywords

Machine learning for robot control; Motion and path planning; Reinforcement learning

Funding

  1. Wallenberg AI, Autonomous Systems and Software Program (WASP) - Knut and Alice Wallenberg Foundation


This study introduces a novel algorithm called PAHRL, which combines planning algorithms and reinforcement learning to address problems with implicitly defined goals by dividing tasks into shorter MDPs. During testing, a planner determines useful subgoals on the state graph constructed at the bottom level, showcasing the effectiveness of this approach in solving long-horizon decision-making problems.
Planning algorithms are powerful at solving long-horizon decision-making problems but require that the environment dynamics are known. Model-free reinforcement learning has recently been merged with graph-based planning to increase the robustness of trained policies in state-space navigation problems. Recent work suggests using planning to provide intermediate waypoints that guide the policy in long-horizon tasks. Yet, it is not always practical to describe a problem in the setting of state-to-state navigation. Often, the goal is defined by one or multiple disjoint sets of valid states, or implicitly through an abstract task description. Building upon previous efforts, we introduce a novel algorithm called Planning-Augmented Hierarchical Reinforcement Learning (PAHRL), which translates the concept of hybrid planning/RL to such problems with implicitly defined goals. Using a hierarchical framework, we divide the original task, formulated as a Markov Decision Process (MDP), into a hierarchy of shorter-horizon MDPs. Actor-critic agents are trained in parallel for each level of the hierarchy. During testing, a planner then determines useful subgoals on a state graph constructed at the bottom level of the hierarchy. The effectiveness of our approach is demonstrated on a set of continuous control problems in simulation, including robot arm reaching tasks and the manipulation of a deformable object.
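The planning step described in the abstract, searching a state graph for subgoals that lead to any member of an implicitly defined goal set, can be illustrated with a minimal sketch. The graph, edge costs, and function name below are illustrative assumptions, not the authors' implementation; in PAHRL the graph and costs would come from states and value estimates collected at the bottom level of the hierarchy.

```python
import heapq

def plan_subgoals(edges, start, goal_states):
    """Dijkstra search over a state graph.

    Returns the waypoint sequence from `start` to the cheapest reachable
    state in `goal_states` (the goal is a *set* of valid states, matching
    the implicitly-defined-goal setting), or None if no goal is reachable.
    `edges` maps each state to a list of (neighbor, cost) pairs.
    """
    frontier = [(0.0, start, [start])]
    visited = set()
    while frontier:
        cost, state, path = heapq.heappop(frontier)
        if state in visited:
            continue
        visited.add(state)
        if state in goal_states:   # any valid goal state terminates the search
            return path[1:]        # subgoals exclude the start state
        for nxt, c in edges.get(state, []):
            if nxt not in visited:
                heapq.heappush(frontier, (cost + c, nxt, path + [nxt]))
    return None

# Toy state graph with two disjoint goal states {"g1", "g2"}
edges = {
    "s": [("a", 1.0), ("b", 4.0)],
    "a": [("g1", 2.0)],
    "b": [("g2", 1.0)],
}
print(plan_subgoals(edges, "s", {"g1", "g2"}))  # ['a', 'g1']
```

The returned waypoints would then be handed one at a time to the low-level policy as short-horizon subgoals.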

Authors

