Article

Multi-UAV Navigation for Partially Observable Communication Coverage by Graph Reinforcement Learning

IEEE TRANSACTIONS ON MOBILE COMPUTING (2023)

Journal

IEEE TRANSACTIONS ON MOBILE COMPUTING

Volume 22, Issue 7, Pages 4056-4069

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/TMC.2022.3146881

Keywords

Navigation; Task analysis; Training; Stochastic processes; Base stations; Ad hoc networks; Autonomous aerial vehicles; UAV control; communication coverage; deep reinforcement learning; graph learning; stochastic policy

Categories

Computer Science, Information Systems; Telecommunications


Smart summary

In this paper, a deep reinforcement learning (DRL) based control solution is proposed to navigate a swarm of unmanned aerial vehicles (UAVs) in an unexplored target area under partial observation, serving as Mobile Base Stations (MBSs) for optimal communication coverage. A novel network architecture called Deep Recurrent Graph Network (DRGN) is introduced to handle information loss and obtain spatial information through inter-UAV communication. Based on DRGN and maximum-entropy learning, a stochastic DRL policy named Soft Deep Recurrent Graph Network (SDRGN) is proposed. Extensive experiments demonstrate the superior performance and scalability of SDRGN compared to state-of-the-art approaches.

Abstract

In this paper, we aim to design a deep reinforcement learning (DRL) based control solution for navigating a swarm of unmanned aerial vehicles (UAVs) around an unexplored target area under partial observation, where the UAVs serve as Mobile Base Stations (MBSs) providing optimal communication coverage for ground mobile users. To handle the information loss caused by partial observability, we introduce a novel network architecture named Deep Recurrent Graph Network (DRGN), which obtains extra spatial information through graph-convolution-based inter-UAV communication and exploits historical features with a recurrent unit. Based on DRGN and maximum-entropy learning, we propose a stochastic DRL policy named Soft Deep Recurrent Graph Network (SDRGN). In SDRGN, we design a heuristic reward function based on the local information of each UAV rather than on global information; SDRGN thus reduces the training cost and enables distributed online learning. We conducted extensive experiments to design the structure of DRGN and examine the performance of SDRGN. The simulation results show that the proposed model outperforms four state-of-the-art DRL-based approaches and three heuristic baselines, and demonstrate the scalability, transferability, robustness, and interpretability of SDRGN.
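
To make two of the ideas in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that inter-UAV communication can be approximated by averaging feature vectors over UAVs within a fixed communication radius (a simple stand-in for a learned graph convolution), and that the maximum-entropy stochastic policy can be illustrated by a Boltzmann distribution over per-action values with temperature alpha. The function names, the radius, and the feature layout are all hypothetical; the recurrent unit and the training procedure are omitted.

```python
import math


def neighbors_in_range(positions, i, comm_radius):
    """Indices of UAVs (including UAV i itself) within comm_radius of UAV i."""
    xi, yi = positions[i]
    return [
        j
        for j, (xj, yj) in enumerate(positions)
        if math.hypot(xi - xj, yi - yj) <= comm_radius
    ]


def aggregate_features(features, positions, i, comm_radius):
    """Mean of the feature vectors in UAV i's neighborhood.

    Assumption: plain averaging replaces the learned graph convolution.
    """
    idx = neighbors_in_range(positions, i, comm_radius)
    dim = len(features[i])
    return [sum(features[j][k] for j in idx) / len(idx) for k in range(dim)]


def soft_policy(q_values, alpha):
    """Action probabilities proportional to exp(Q(a) / alpha).

    A larger alpha gives a more uniform (higher-entropy) policy.
    """
    m = max(q_values)  # subtract the maximum for numerical stability
    exps = [math.exp((q - m) / alpha) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]
```

In this sketch each UAV uses only what it can receive from nearby UAVs, which mirrors the abstract's emphasis on local rather than global information; a real model would learn the aggregation weights and feed the result through a recurrent unit before computing action values.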
