Article

Action decoupled SAC reinforcement learning with discrete-continuous hybrid action spaces

Journal

NEUROCOMPUTING
Volume 537, Pages 141-151

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2023.03.054

Keywords

Reinforcement learning; Hybrid action space; SAC; Visual perception


This paper proposes an action-decoupled algorithm for hybrid action spaces, which abstracts the original agent into two agents containing only a discrete or a continuous action space; the discrete and continuous actions are executed simultaneously. The proposed algorithm, named AD-SAC, is optimized with the Soft Actor-Critic (SAC) algorithm. Experimental results show that it outperforms other algorithms in convergence and robustness on UAV path planning and gimbal scanning tasks.
Most existing Deep Reinforcement Learning (DRL) algorithms apply only to discrete or continuous action spaces. However, an agent often has both continuous and discrete actions, known as a hybrid action space. This paper proposes an action-decoupled algorithm for the hybrid action space. Specifically, the hybrid action is decoupled, and the original agent in the hybrid action space is abstracted into two agents, each containing only a discrete or a continuous action space. The discrete and continuous actions are independent of each other and are executed simultaneously. We use the Soft Actor-Critic (SAC) algorithm as the optimization method and name the proposed algorithm Action Decoupled SAC (AD-SAC). We handle the resulting multi-agent problem with the Centralized Training Decentralized Execution (CTDE) framework and reduce the concatenation of partial agent observations to avoid interference from redundant observations. We design a hybrid action space environment for Unmanned Aerial Vehicle (UAV) path planning and gimbal scanning in AirSim. The results show that our algorithm converges better and is more robust than discretization, relaxation, and Parametrized Deep Q-Networks (P-DQN) baselines. Finally, we carry out a Hardware-in-the-Loop (HITL) simulation experiment based on Pixhawk to verify the feasibility of the algorithm. (c) 2023 Published by Elsevier B.V.
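As a rough illustration of the decoupling described in the abstract, the sketch below splits one hybrid-action policy into a discrete head and a continuous head that sample independently and act simultaneously. This is a minimal sketch and not the authors' implementation: the shared encoder, layer sizes, and action dimensions are assumed for illustration, and the full AD-SAC machinery (SAC critics, entropy temperatures, CTDE training, observation pruning) is omitted.

# Minimal sketch of the action-decoupling idea (illustrative, not the paper's code).
import torch
import torch.nn as nn
from torch.distributions import Categorical, Normal

class DecoupledHybridPolicy(nn.Module):
    """One shared encoder feeding two independent heads:
    a discrete 'agent' (categorical) and a continuous 'agent' (Gaussian)."""
    def __init__(self, obs_dim, n_discrete, cont_dim, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.discrete_head = nn.Linear(hidden, n_discrete)   # logits over discrete choices
        self.cont_mean = nn.Linear(hidden, cont_dim)          # mean of continuous action
        self.cont_log_std = nn.Linear(hidden, cont_dim)       # log std of continuous action

    def forward(self, obs):
        h = self.encoder(obs)
        disc_dist = Categorical(logits=self.discrete_head(h))
        cont_dist = Normal(self.cont_mean(h), self.cont_log_std(h).exp())
        return disc_dist, cont_dist

    def sample(self, obs):
        # The two actions are sampled independently and executed together as one
        # hybrid action; log-probs are kept separate so each head could be trained
        # with its own SAC-style entropy-regularized objective (squashing and the
        # corresponding log-prob correction are omitted in this sketch).
        disc_dist, cont_dist = self(obs)
        a_d = disc_dist.sample()
        a_c = cont_dist.sample()
        return a_d, a_c, disc_dist.log_prob(a_d), cont_dist.log_prob(a_c).sum(-1)

if __name__ == "__main__":
    policy = DecoupledHybridPolicy(obs_dim=16, n_discrete=4, cont_dim=2)
    obs = torch.randn(1, 16)
    a_d, a_c, logp_d, logp_c = policy.sample(obs)
    print(a_d.item(), a_c.squeeze().tolist())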
