Journal
NEUROCOMPUTING
Volume 450, Pages 119-128
Publisher
ELSEVIER
DOI: 10.1016/j.neucom.2021.04.015
Keywords
Model-based reinforcement learning; Markov decision process; Continuous control; Delayed system
Funding
- China Scholarship Council
Action delays degrade the performance of reinforcement learning in many real-world systems. This paper proposes a formal definition of the delay-aware Markov Decision Process and proves that it can be transformed into a standard MDP with augmented states using the Markov reward process. We develop a delay-aware model-based reinforcement learning framework that can incorporate the multi-step delay into the learned system models without learning effort. Experiments on the Gym and MuJoCo platforms show that the proposed delay-aware model-based algorithm is more efficient in training and transferable between systems with various durations of delay, compared with state-of-the-art model-free reinforcement learning methods. (c) 2021 Elsevier B.V. All rights reserved.
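The state augmentation described in the abstract can be illustrated with a minimal sketch: the delayed process becomes Markov again once the state is extended with the buffer of actions already committed but not yet executed. The environment interface below (`reset`/`step`) and the `EchoEnv` plant are hypothetical illustrations, not the paper's actual implementation.

```python
from collections import deque

class DelayedActionWrapper:
    """Make an n-step action delay Markov again by augmenting the state.

    The augmented state is (observation, tuple of pending actions):
    an action submitted now takes effect only `delay` steps later.
    """
    def __init__(self, env, delay, default_action):
        self.env = env
        self.delay = delay
        self.default_action = default_action

    def reset(self):
        obs = self.env.reset()
        # Actions already committed but not yet applied to the system.
        self.buffer = deque([self.default_action] * self.delay)
        return obs, tuple(self.buffer)

    def step(self, action):
        self.buffer.append(action)          # commit the new action
        executed = self.buffer.popleft()    # apply the one chosen `delay` steps ago
        obs, reward, done = self.env.step(executed)
        return (obs, tuple(self.buffer)), reward, done

# Toy plant whose observation echoes the executed action, making the delay visible.
class EchoEnv:
    def reset(self):
        return 0
    def step(self, a):
        return a, float(a), False

wrapped = DelayedActionWrapper(EchoEnv(), delay=2, default_action=0)
obs, pending = wrapped.reset()
(obs, pending), _, _ = wrapped.step(5)  # a default action (0) executes first
(obs, pending), _, _ = wrapped.step(7)
(obs, pending), _, _ = wrapped.step(9)  # the 5 committed two steps ago now executes
```

Because the pending-action buffer is part of the state, a model learned on this augmented MDP can roll the known delay forward without extra learning effort, which is the transformation the paper exploits.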
Authors