Journal
IEEE ROBOTICS AND AUTOMATION LETTERS
Volume 6, Issue 3, Pages 5097-5104
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/LRA.2021.3071062
Keywords
Machine learning for robot control; Motion and path planning; Reinforcement learning
Funding
- Wallenberg AI, Autonomous Systems, and Software Program (WASP) - Knut and Alice Wallenberg Foundation
This study introduces Planning-Augmented Hierarchical Reinforcement Learning (PAHRL), an algorithm that combines planning with reinforcement learning to handle problems with implicitly defined goals. It divides the original task into a hierarchy of shorter-horizon MDPs; during testing, a planner selects useful subgoals on a state graph constructed at the bottom level of the hierarchy. The approach is demonstrated on long-horizon continuous control problems in simulation.
Planning algorithms are powerful at solving long-horizon decision-making problems but require that the environment dynamics be known. Model-free reinforcement learning has recently been merged with graph-based planning to increase the robustness of trained policies in state-space navigation problems. Recent work suggests using planning to provide intermediate waypoints that guide the policy in long-horizon tasks. Yet it is not always practical to describe a problem in the setting of state-to-state navigation. Often, the goal is defined by one or multiple disjoint sets of valid states, or implicitly through an abstract task description. Building upon previous efforts, we introduce a novel algorithm called Planning-Augmented Hierarchical Reinforcement Learning (PAHRL), which translates the concept of hybrid planning/RL to such problems with implicitly defined goals. Using a hierarchical framework, we divide the original task, formulated as a Markov Decision Process (MDP), into a hierarchy of shorter-horizon MDPs. Actor-critic agents are trained in parallel for each level of the hierarchy. During testing, a planner then determines useful subgoals on a state graph constructed at the bottom level of the hierarchy. The effectiveness of our approach is demonstrated for a set of continuous control problems in simulation, including robot arm reaching tasks and the manipulation of a deformable object.
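To make the planning step more concrete, the following is a minimal sketch of how subgoals might be selected on a state graph when the goal is a set of valid states rather than a single target state. It is not the authors' implementation: the function names `build_state_graph` and `plan_subgoals`, the `max_edge_cost` threshold, and the use of Euclidean distance as the edge cost are all assumptions made for illustration, since the abstract does not specify how the graph is built or how edges are weighted.

```python
import heapq
import math


def build_state_graph(states, distance, max_edge_cost):
    """Connect each ordered pair of states whose estimated cost is small enough.

    `distance` is a stand-in for whatever cost estimate is available
    (assumption: plain Euclidean distance in the usage example below).
    """
    graph = {i: [] for i in range(len(states))}
    for i, s in enumerate(states):
        for j, t in enumerate(states):
            if i != j:
                cost = distance(s, t)
                if cost <= max_edge_cost:
                    graph[i].append((j, cost))
    return graph


def plan_subgoals(graph, start, goal_nodes):
    """Dijkstra search from `start` to the cheapest node in `goal_nodes`.

    Returns the list of node indices along the path (start included),
    or None if no goal node is reachable.
    """
    goal_nodes = set(goal_nodes)
    best = {start: 0.0}
    parent = {start: None}
    frontier = [(0.0, start)]
    while frontier:
        cost, node = heapq.heappop(frontier)
        if cost > best.get(node, math.inf):
            continue  # stale queue entry, a cheaper route was already found
        if node in goal_nodes:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt, edge_cost in graph[node]:
            new_cost = cost + edge_cost
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                parent[nxt] = node
                heapq.heappush(frontier, (new_cost, nxt))
    return None


# Hypothetical 2-D states; nodes 3 and 4 together form the set of valid goal states.
states = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (5.0, 5.0)]
graph = build_state_graph(states, math.dist, max_edge_cost=1.5)
path = plan_subgoals(graph, start=0, goal_nodes={3, 4})
subgoals = [states[i] for i in path[1:]]  # waypoints handed to the low-level policy
```

Because the search stops at the first goal node it pops, a goal given as several disjoint sets of states needs no special handling: the planner simply returns the path to whichever valid state is cheapest to reach.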