Article

Distributional and hierarchical reinforcement learning for physical systems with noisy state observations and exogenous perturbations

Journal

Engineering Applications of Artificial Intelligence

Publisher

PERGAMON-ELSEVIER SCIENCE LTD

DOI: 10.1016/j.engappai.2023.106465

Keywords

Hierarchical reinforcement learning; Distributional reinforcement learning; Noisy state observation; Exogenous perturbation


Summary

This paper introduces a novel approach that combines hierarchical reinforcement learning and distributional reinforcement learning to address complex sparse-reward tasks. The proposed method models random rewards as random variables following a value distribution within a hierarchical policy structure. The results demonstrate its effectiveness in handling uncertainties caused by noise and perturbations, and show potential for developing more robust and effective reinforcement learning algorithms for real physical systems.

Abstract

Reinforcement learning has shown remarkable success in various applications, in some cases even outperforming humans. Despite this potential, however, numerous challenges remain. In this paper, we introduce a novel approach that exploits the synergies between hierarchical reinforcement learning and distributional reinforcement learning to address complex sparse-reward tasks in which noisy state observations or non-stationary exogenous perturbations are present. Our proposed method has a hierarchical policy structure in which random rewards are modeled as random variables that follow a value distribution. This enables the handling of complex tasks and increases robustness to uncertainties arising from measurement noise or exogenous perturbations, such as wind. To achieve this, we extend the distributional soft Bellman operator and the temporal difference error to the hierarchical structure, and we use quantile regression to approximate the reward distribution. We evaluate our method on a bipedal robot in the OpenAI Gym environment and on an electric autonomous vehicle in the SUMO traffic simulator. The results demonstrate the effectiveness of our approach, compared with state-of-the-art methods, in solving complex tasks under the aforementioned uncertainties. Our approach shows promising results in handling uncertainties caused by noise and perturbations in challenging sparse-reward tasks, and could pave the way for more robust and effective reinforcement learning algorithms in real physical systems.
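The abstract states that quantile regression is used to approximate the reward distribution under an extended distributional soft Bellman operator. As a rough illustration only (the paper's hierarchical extension is not reproduced here, and none of this is the authors' code), the sketch below shows the standard quantile-Huber loss in the style of QR-DQN (Dabney et al., 2018), which is the usual way quantile regression enters distributional critics; all names and the kappa default are assumptions.

# Minimal sketch, assuming a PyTorch critic that outputs N quantile
# estimates of the return distribution. Illustrative only; not the
# authors' implementation.
import torch

def quantile_huber_loss(pred_quantiles: torch.Tensor,
                        target_quantiles: torch.Tensor,
                        kappa: float = 1.0) -> torch.Tensor:
    """pred_quantiles: (batch, N) quantile estimates of the return.
    target_quantiles: (batch, N') samples of a distributional TD
        target, e.g. r + gamma * Z(s', a'), detached so that no
        gradient flows through the target.
    """
    N = pred_quantiles.shape[1]
    # Quantile midpoints tau_i = (2i - 1) / (2N), i = 1..N.
    taus = (torch.arange(N, device=pred_quantiles.device,
                         dtype=pred_quantiles.dtype) + 0.5) / N
    # Pairwise TD errors u_ij = target_j - pred_i, shape (batch, N, N').
    u = target_quantiles.unsqueeze(1) - pred_quantiles.unsqueeze(2)
    # Huber loss: quadratic near zero, linear beyond kappa, which
    # keeps gradients bounded under noisy observations.
    huber = torch.where(u.abs() <= kappa,
                        0.5 * u.pow(2),
                        kappa * (u.abs() - 0.5 * kappa))
    # Asymmetric weight |tau - 1{u < 0}| drives each estimate toward
    # its own quantile of the target distribution.
    weight = (taus.view(1, N, 1) - (u.detach() < 0).float()).abs()
    # Sum over predicted quantiles, average over target samples and batch.
    return (weight * huber / kappa).sum(dim=1).mean()

In a hierarchical setup of the kind the paper describes, the high-level and low-level critics would presumably each carry their own quantile head, with target samples drawn from a soft (entropy-regularized) Bellman backup; those details are specific to the paper and not reproduced in this sketch.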

