Article

Hierarchical and Stable Multiagent Reinforcement Learning for Cooperative Navigation Control

Journal

IEEE Transactions on Neural Networks and Learning Systems

Publisher

IEEE (Institute of Electrical and Electronics Engineers)
DOI: 10.1109/TNNLS.2021.3089834

Keywords

Navigation; Planning; Estimation; Mathematical model; Task analysis; Markov processes; Games; Cooperative navigation; hierarchical policy learning; multiagent deep reinforcement learning (MADRL)


Abstract

We address an important and challenging cooperative navigation control problem: Multiagent Navigation to Unassigned Multiple targets (MNUM) in unknown environments, in minimal time and without collisions. Conventional approaches rely on multiagent path planning, which requires building an environment map and performing expensive real-time path-planning computations. In this article, we formulate MNUM as a stochastic game and devise a novel multiagent deep reinforcement learning (MADRL) algorithm that learns an end-to-end solution, directly mapping raw sensor data to control signals. Once learned, the policy can be deployed onto each agent, so the expensive online planning computations can be avoided. However, when applied to MNUM, traditional MADRL suffers from a large policy solution space and a nonstationary environment, because agents make decisions independently and concurrently. Accordingly, we propose a hierarchical and stable MADRL algorithm. The hierarchical learning part introduces a two-layer policy model to reduce the solution space and uses an interlaced learning paradigm to learn the two coupled policies. The stable learning part learns an extended action-value function that implicitly incorporates estimates of other agents' actions, which alleviates the nonstationarity caused by other agents' changing policies. Extensive experiments demonstrate that our method converges quickly and generates more efficient cooperative navigation policies than comparable methods.
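The abstract gives no implementation details, but its two key ideas, a two-layer policy (high-level target selection feeding a low-level controller) and an extended action-value function conditioned on estimates of the other agents' actions, can be illustrated with a minimal sketch. The class names, network sizes, and the placeholder estimate of other agents' actions below are hypothetical and chosen for illustration; they are not the authors' actual architecture or training procedure.

```python
# Minimal sketch (hypothetical names and sizes) of a two-layer policy and an
# extended action-value function Q_i(obs_i, a_i, a_hat_-i), where a_hat_-i are
# estimated actions of the other agents used to reduce apparent nonstationarity.
import torch
import torch.nn as nn

class HighLevelPolicy(nn.Module):
    """Maps raw sensor observations to a distribution over targets (sub-goals)."""
    def __init__(self, obs_dim, num_targets, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_targets),
        )

    def forward(self, obs):
        return torch.softmax(self.net(obs), dim=-1)

class LowLevelPolicy(nn.Module):
    """Maps observations plus the chosen target (one-hot) to control signals."""
    def __init__(self, obs_dim, num_targets, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + num_targets, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim), nn.Tanh(),  # bounded control commands
        )

    def forward(self, obs, target_onehot):
        return self.net(torch.cat([obs, target_onehot], dim=-1))

class ExtendedQ(nn.Module):
    """Critic that also sees estimated actions of the other agents."""
    def __init__(self, obs_dim, act_dim, num_others, hidden=64):
        super().__init__()
        in_dim = obs_dim + act_dim + num_others * act_dim
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, own_act, est_other_acts):
        return self.net(torch.cat([obs, own_act, est_other_acts], dim=-1))

# Toy forward pass for one agent in a 3-agent, 3-target setting.
obs_dim, num_targets, act_dim, num_others = 16, 3, 2, 2
high = HighLevelPolicy(obs_dim, num_targets)
low = LowLevelPolicy(obs_dim, num_targets, act_dim)
critic = ExtendedQ(obs_dim, act_dim, num_others)

obs = torch.randn(1, obs_dim)
target_probs = high(obs)
target = nn.functional.one_hot(target_probs.argmax(-1), num_targets).float()
action = low(obs, target)
est_other_acts = torch.zeros(1, num_others * act_dim)  # placeholder estimate
q_value = critic(obs, action, est_other_acts)
```

In the paper, the interlaced learning paradigm alternates updates between the two coupled policies, and the critic's estimates of other agents' actions are learned; here the two levels are shown only as independent forward passes with a zero placeholder estimate.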

