Article

ME-MADDPG: An efficient learning-based motion planning method for multiple agents in complex environments

Journal

INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS
Volume 37, Issue 3, Pages 2393-2427

Publisher

WILEY
DOI: 10.1002/int.22778

Keywords

deep reinforcement learning; MADDPG; motion planning; multiagent

Funding

  1. National Natural Science Foundation of China [62003267]
  2. Natural Science Foundation of Shaanxi Province [2020JQ-220]
  3. Science and Technology on Electronic Information Control Laboratory [JS20201100339]

Abstract

This paper proposes ME-MADDPG, which improves the efficiency and adaptability of multiagent motion planning by introducing a mixed experience strategy into MADDPG. Experiments show that, compared with the traditional MADDPG, the proposed algorithm converges significantly faster and to better policies during training, and performs better in complex dynamic environments.
Developing efficient motion policies for multiple agents is challenging in decentralized dynamic situations, where each agent plans its own path without knowing the policies of the other agents involved. This paper presents an efficient learning-based motion planning method for multiagent systems. It adopts the framework of the multiagent deep deterministic policy gradient (MADDPG) to directly map partially observed information to motion commands for multiple agents. To improve the sample efficiency of MADDPG, and thereby train agents that can adapt to more complex environments, a strategy named mixed experience (ME) is introduced into MADDPG, yielding the proposed ME-MADDPG algorithm. The ME strategy is embodied in three specific mechanisms: (1) an artificial potential field-based sample generator that produces high-quality samples in the early training stage; (2) a dynamic mixed sampling strategy that mixes training data from different sources in a variable proportion; (3) a delayed learning skill that stabilizes the training of the multiple agents. A series of experiments verifies the performance of the proposed algorithm: compared with MADDPG, ME-MADDPG significantly improves both the convergence speed and the convergence effect during training, and it shows better efficiency and adaptability when used for multiagent motion planning in complex dynamic environments.
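As a rough, non-authoritative sketch of the mixed experience idea (this is not the authors' code; all names, constants, and the linear annealing schedule are assumptions), the Python below combines the three mechanisms named in the abstract: an artificial potential field planner that generates early "expert" transitions, a buffer that mixes expert and agent samples in a variable proportion, and a delayed-update condition in the training loop.

```python
import math
import random
from collections import deque

def apf_action(pos, goal, obstacles, k_att=1.0, k_rep=0.5, rho0=2.0):
    """Artificial potential field planner (standard APF form, assumed here):
    attractive pull toward the goal plus repulsive push from nearby obstacles.
    Used to generate high-quality samples in the early training stage."""
    fx = k_att * (goal[0] - pos[0])           # attractive component
    fy = k_att * (goal[1] - pos[1])
    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        if 0.0 < d < rho0:                    # repulsion only within range rho0
            mag = k_rep * (1.0 / d - 1.0 / rho0) / (d * d)
            fx += mag * dx / d
            fy += mag * dy / d
    return fx, fy

class MixedExperienceBuffer:
    """Keeps APF-generated 'expert' transitions and the agents' own
    transitions in separate buffers, then samples a mixed minibatch."""

    def __init__(self, capacity=100_000):
        self.expert = deque(maxlen=capacity)
        self.agent = deque(maxlen=capacity)

    def sample(self, batch_size, expert_fraction):
        n_expert = min(int(batch_size * expert_fraction), len(self.expert))
        n_agent = min(batch_size - n_expert, len(self.agent))
        batch = (random.sample(self.expert, n_expert)
                 + random.sample(self.agent, n_agent))
        random.shuffle(batch)
        return batch

def expert_fraction(step, start=0.9, end=0.0, decay_steps=50_000):
    """Variable mixing proportion: start mostly from APF samples and anneal
    toward pure agent experience (the linear schedule is an assumption)."""
    return max(end, start + (end - start) * min(step / decay_steps, 1.0))

if __name__ == "__main__":
    buf = MixedExperienceBuffer()
    for _ in range(1_000):                    # dummy transitions for demonstration
        buf.expert.append(("obs", "act", 1.0, "next_obs"))
        buf.agent.append(("obs", "act", 0.0, "next_obs"))
    UPDATE_EVERY = 10                         # delayed learning: assumed interval
    for step in range(100):
        if step % UPDATE_EVERY == 0:          # update networks only every few steps
            batch = buf.sample(64, expert_fraction(step))
            # ...feed `batch` to the MADDPG critic/actor updates here...
```

In the paper's training pipeline these pieces would sit inside the MADDPG update loop; the delayed-update interval and annealing constants above are illustrative placeholders, not values taken from the article.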
