4.6 Article

Collaborative Decision-Making Method for Multi-UAV Based on Multiagent Reinforcement Learning

Journal

IEEE ACCESS
Volume 10, Pages 91385-91396

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2022.3199070

Keywords

Collaboration; Decision making; Autonomous aerial vehicles; Prediction algorithms; Task analysis; Training data; Heuristic algorithms; Reinforcement learning; Multi-agent systems; UAV; multi-UAV; collaborative decision-making; multi-agent reinforcement learning

Funding

  1. National Natural Science Foundation of China [61903014]
  2. Aeronautical Science Foundation of China [20200017051001]


This article investigates collaborative decision-making in multi-UAV missions and proposes a multi-agent reinforcement learning algorithm to address it. Each UAV is treated as an actor that collects data in a decentralized manner, while a centralized critic provides evaluation information during training. By introducing a gated recurrent unit in the actor and an attention mechanism in the critic, the algorithm can learn better decision-making strategies in complex environments.
The collaborative mission capability of multi-UAV systems has received increasing attention in recent years as research on multi-UAV theory and applications has intensified. Integrating artificial intelligence into a multi-UAV collaborative decision-making system can effectively improve this capability. We propose a multi-agent reinforcement learning algorithm for multi-UAV collaborative decision-making. Our approach is based on the actor-critic algorithm, where each UAV is treated as an actor that collects data in the environment in a decentralized manner. During the centralized training of these actors, a centralized critic provides evaluation information at each training step. We introduce a gated recurrent unit (GRU) in the actor so that each UAV can take its historical decision information into account when making decisions. Moreover, we design the centralized critic with an attention mechanism, which enables better learning in complex environments. Finally, the algorithm is trained and evaluated in a multi-UAV air combat scenario developed in the collaborative decision-making environment. The experimental results show that our approach learns collaborative decision-making strategies with excellent performance and converges better than the other algorithms compared.
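The structure described in the abstract (decentralized GRU actors evaluated by a centralized attention critic) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the class names `GRUActor` and `AttentionCritic`, all dimensions, the random weights, the omission of bias terms, and the choice of single-head scaled dot-product attention are assumptions made for illustration, and no training loop is shown.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

class GRUActor:
    """Hypothetical per-UAV policy: a bias-free GRU cell plus a softmax action head."""
    def __init__(self, obs_dim, hidden_dim, n_actions, rng):
        in_dim = obs_dim + hidden_dim
        self.Wz = rand_matrix(hidden_dim, in_dim, rng)     # update gate
        self.Wr = rand_matrix(hidden_dim, in_dim, rng)     # reset gate
        self.Wh = rand_matrix(hidden_dim, in_dim, rng)     # candidate state
        self.Wo = rand_matrix(n_actions, hidden_dim, rng)  # action logits

    def step(self, obs, h):
        """Use only the local observation and this UAV's own hidden state."""
        xh = obs + h
        z = [sigmoid(v) for v in matvec(self.Wz, xh)]
        r = [sigmoid(v) for v in matvec(self.Wr, xh)]
        gated = obs + [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(v) for v in matvec(self.Wh, gated)]
        h_new = [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]
        return softmax(matvec(self.Wo, h_new)), h_new

class AttentionCritic:
    """Hypothetical centralized critic: agent i attends over the other agents."""
    def __init__(self, feat_dim, embed_dim, rng):
        self.We = rand_matrix(embed_dim, feat_dim, rng)   # shared encoder
        self.Wq = rand_matrix(embed_dim, embed_dim, rng)  # query projection
        self.Wk = rand_matrix(embed_dim, embed_dim, rng)  # key projection
        self.Wv = rand_matrix(embed_dim, embed_dim, rng)  # value projection
        self.w_out = rand_matrix(1, 2 * embed_dim, rng)   # scalar value head

    def value(self, feats, i):
        """Return (Q estimate for agent i, attention weights over the others)."""
        enc = [[math.tanh(v) for v in matvec(self.We, f)] for f in feats]
        q = matvec(self.Wq, enc[i])
        others = [j for j in range(len(feats)) if j != i]
        scale = math.sqrt(len(q))
        scores = [sum(a * b for a, b in zip(q, matvec(self.Wk, enc[j]))) / scale
                  for j in others]
        weights = softmax(scores)
        context = [0.0] * len(q)
        for w, j in zip(weights, others):
            v = matvec(self.Wv, enc[j])
            context = [c + w * vi for c, vi in zip(context, v)]
        return matvec(self.w_out, enc[i] + context)[0], weights

# Illustrative sizes only; none of these values come from the paper.
rng = random.Random(0)
n_agents, obs_dim, hidden_dim, n_actions = 3, 4, 8, 5
actors = [GRUActor(obs_dim, hidden_dim, n_actions, rng) for _ in range(n_agents)]
critic = AttentionCritic(obs_dim + n_actions, 8, rng)
hidden = [[0.0] * hidden_dim for _ in range(n_agents)]
obs = [[rng.uniform(-1, 1) for _ in range(obs_dim)] for _ in range(n_agents)]

feats, all_probs = [], []
for k, actor in enumerate(actors):
    probs, hidden[k] = actor.step(obs[k], hidden[k])  # decentralized execution
    action = rng.choices(range(n_actions), weights=probs)[0]
    one_hot = [1.0 if a == action else 0.0 for a in range(n_actions)]
    feats.append(obs[k] + one_hot)  # the critic sees every observation-action pair
    all_probs.append(probs)

q0, attn0 = critic.value(feats, 0)  # centralized evaluation for UAV 0
```

In this sketch each actor reads only its own observation and recurrent state, while the critic reads the observation-action pairs of all UAVs, which mirrors the centralized-training, decentralized-execution split the abstract describes.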
