Article

Three-Dimension Trajectory Design for Multi-UAV Wireless Network With Deep Reinforcement Learning

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY (2021)

Journal

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY

Volume 70, Issue 1, Pages 600-612

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TVT.2020.3047800

Keywords

Trajectory; Wireless communication; Three-dimensional displays; Propagation losses; Interference; Heuristic algorithms; Downlink; Capacity; constrained Markov decision process (CMDP); deep reinforcement learning (DRL); trajectory design; unmanned aerial vehicles (UAVs)

Categories

Engineering, Electrical & Electronic; Telecommunications; Transportation Science & Technology

Funding

Beijing Municipal Natural Science Foundation-Haidian [L182037]
Beijing Municipal Science and Technology [Z181100003218015]
Beijing Natural Science Foundation [L192032]

Summary

The study investigates trajectory design for multiple UAVs to improve communication system capacity. It uses a constrained Deep Q-Network (cDQN) algorithm to maximize real-time downlink capacity while attempting to keep all ground terminals covered.

Abstract

The effective trajectory design of multiple unmanned aerial vehicles (UAVs) is investigated for improving the capacity of the communication system. The aim is to maximize real-time downlink capacity under a coverage constraint by exploiting the mobility of the UAVs. The three-dimensional (3D) dynamic movement of UAVs under the coverage constraint is formulated as a Constrained Markov Decision Process (CMDP), and a constrained Deep Q-Network (cDQN) algorithm is proposed to solve it. In the proposed cDQN model, each UAV acts as an agent that explores and learns its 3D deployment policy. The model aims to obtain the maximum capacity while attempting to guarantee that all ground terminals (GTs) are covered. To satisfy the coverage constraint, a primal-dual method is adopted that trains the primal variable and the dual variable (the Lagrangian multiplier) in turn. Furthermore, to reduce the action space of the cDQN algorithm, an action filter uses prior information to eliminate invalid actions. Experimental results demonstrate that the cDQN algorithm converges after some training steps. Additionally, the UAVs are capable of adapting to the movement of the GTs under the coverage constraint by following the 3D deployment policy derived from the proposed cDQN algorithm.
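The two mechanisms named in the abstract, the action filter and the alternating primal-dual update, can be illustrated with a minimal Python sketch. This is not the paper's implementation. The seven-move action set, the box-shaped flight bounds, the coverage target of 1.0, the learning rate, and all function names are hypothetical choices made for illustration, and a plain list of Q-values stands in for the deep Q-network.

```python
import random

# Hypothetical 3D action set: six unit moves plus hover (an assumption, not the paper's set).
ACTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1), (0, 0, 0)]


def valid_actions(pos, bounds):
    """Action filter: keep only the moves that leave the UAV inside the allowed box."""
    keep = []
    for i, move in enumerate(ACTIONS):
        nxt = tuple(p + d for p, d in zip(pos, move))
        if all(lo <= c <= hi for c, (lo, hi) in zip(nxt, bounds)):
            keep.append(i)
    return keep


def select_action(q_values, pos, bounds, epsilon=0.1, rng=random):
    """Epsilon-greedy choice restricted to the filtered action set."""
    allowed = valid_actions(pos, bounds)
    if rng.random() < epsilon:
        return rng.choice(allowed)
    return max(allowed, key=lambda i: q_values[i])


def lagrangian_reward(capacity, coverage, lam, target=1.0):
    """Primal signal: capacity plus lam times the coverage slack (negative when short of target)."""
    return capacity + lam * (coverage - target)


def dual_update(lam, coverage, target=1.0, lr=0.05):
    """Dual step: raise the multiplier while coverage is below target, keeping it non-negative."""
    return max(0.0, lam - lr * (coverage - target))


if __name__ == "__main__":
    bounds = [(0, 4), (0, 4), (0, 4)]      # assumed grid limits for x, y, z
    q = [0.0, 9.0, 0.0, 0.0, 1.0, 0.0, 0.5]  # placeholder Q-values for one state
    print(valid_actions((0, 0, 0), bounds))
    print(select_action(q, (0, 0, 0), bounds, epsilon=0.0))
    print(dual_update(0.0, coverage=0.8))
```

In a training loop built on this sketch, the agent would be trained on `lagrangian_reward` for some steps with the multiplier held fixed, then `dual_update` would adjust the multiplier using the observed coverage ratio, and the two steps would alternate. Filtering before the greedy choice means a high Q-value on an out-of-bounds move can never be selected.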
