Journal
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2022, PT IV
Volume 13716, Pages 68-83
Publisher
SPRINGER INTERNATIONAL PUBLISHING AG
DOI: 10.1007/978-3-031-26412-2_5
Keywords
Planning; Planning horizon; Reinforcement learning
Abstract
Planning is a computationally expensive process, which can limit the reactivity of autonomous agents. Planning problems are usually solved in isolation, independently of similar, previously solved problems. The depth of search that a planner requires to find a solution, known as the planning horizon, is a critical factor when integrating planners into reactive agents. We consider the case of an agent repeatedly carrying out a task from different initial states. We propose a combination of classical planning and model-free reinforcement learning to reduce the planning horizon over time. Control is smoothly transferred from the planner to the model-free policy as the agent compiles the planner's policy into a value function. Local exploration of the model-free policy allows the agent to adapt to the environment and eventually overcome model inaccuracies. We evaluate the efficacy of our framework on symbolic PDDL domains and a stochastic grid world environment, and show that we can significantly reduce the planning horizon while overcoming model inaccuracies.
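The abstract's core idea, handing control from a planner to a learned value function so that planner invocations (and hence the planning horizon) shrink over time, can be illustrated with a minimal sketch. This is not the paper's implementation: the grid world, the visit-count gate for switching from planner to policy, and the Q-learning update are all illustrative assumptions.

```python
import random
from collections import deque

# Hypothetical deterministic grid world; the paper uses PDDL domains and
# a stochastic grid world, so this setup is an illustrative simplification.
SIZE = 5
GOAL = (SIZE - 1, SIZE - 1)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def step(state, action):
    """Apply a move, clamping to the grid boundaries."""
    x, y = state
    dx, dy = action
    return (min(max(x + dx, 0), SIZE - 1), min(max(y + dy, 0), SIZE - 1))

def plan(state):
    """Breadth-first search to GOAL; returns (first action, plan depth)."""
    frontier = deque([(state, [])])
    seen = {state}
    while frontier:
        s, path = frontier.popleft()
        if s == GOAL:
            return (path[0] if path else None, len(path))
        for a in ACTIONS:
            ns = step(s, a)
            if ns not in seen:
                seen.add(ns)
                frontier.append((ns, path + [a]))
    return (None, 0)

def run(episodes=200, alpha=0.5, gamma=0.95, visit_threshold=3):
    """Hybrid control: plan in unfamiliar states, act on Q elsewhere.

    Every transition (planner-driven or not) updates Q, so the planner's
    behaviour is gradually compiled into the value function and the
    recorded planning depth per episode shrinks toward zero.
    """
    Q = {}           # tabular action values
    visits = {}      # per-state visit counts gate the planner fallback
    horizons = []    # deepest planner search used in each episode
    for _ in range(episodes):
        state = (random.randrange(SIZE), random.randrange(SIZE))
        max_horizon = 0
        for _ in range(4 * SIZE):
            if state == GOAL:
                break
            visits[state] = visits.get(state, 0) + 1
            if visits[state] > visit_threshold:
                # Model-free control: act greedily on learned values.
                action = max(ACTIONS, key=lambda a: Q.get((state, a), 0.0))
            else:
                # Unfamiliar state: fall back to the planner.
                action, depth = plan(state)
                max_horizon = max(max_horizon, depth)
            nxt = step(state, action)
            reward = 1.0 if nxt == GOAL else -0.01
            best_next = max(Q.get((nxt, a), 0.0) for a in ACTIONS)
            key = (state, action)
            Q[key] = Q.get(key, 0.0) + alpha * (
                reward + gamma * best_next - Q.get(key, 0.0))
            state = nxt
        horizons.append(max_horizon)
    return horizons
```

Running `run()` and plotting the returned `horizons` shows the planner's search depth collapsing as the value function takes over, which is the qualitative effect the paper reports on its benchmark domains.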