Article

UAV-Enabled Secure Communications by Multi-Agent Deep Reinforcement Learning

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY (2020)

Journal

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY

Volume 69, Issue 10, Pages 11599-11611

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TVT.2020.3014788

Keywords

UAV; multi-agent deep reinforcement learning; trajectory design; policy gradient; physical layer security

Categories

Engineering, Electrical & Electronic; Telecommunications; Transportation Science & Technology

Funding

National Key Research and Development Program of China [2018AAA0102401]
National Natural Science Foundation of China [61831013, 61771274, 61531011, 61871321]
Beijing Municipal Natural Science Foundation [4182030, L182042]
US NSF [EARS-1839818, CNS1717454, CNS-1731424, CNS-1702850]

Abstract

Unmanned aerial vehicles (UAVs) can be employed as aerial base stations to support communication for ground users (GUs). However, because of the high flying altitude, the aerial-to-ground (A2G) channel is dominated by line-of-sight (LoS) links, which are easily wiretapped by ground eavesdroppers (GEs). In this case, a single UAV has limited maneuvering capacity to obtain the desired secure rate in the presence of multiple eavesdroppers. In this paper, we propose a cooperative jamming approach in which UAV jammers help the UAV transmitter defend against GEs. Specifically, the UAV transmitter sends confidential information to the GUs, and the UAV jammers send artificial noise signals to the GEs using 3D beamforming. We propose a multi-agent deep reinforcement learning (MADRL) approach, namely multi-agent deep deterministic policy gradient (MADDPG), to maximize the secure capacity by jointly optimizing the trajectories of the UAVs, the transmit power of the UAV transmitter, and the jamming power of the UAV jammers. The MADDPG algorithm adopts centralized training and distributed execution. Simulation results show that the MADRL method can realize the joint trajectory design of the UAVs and achieve good performance. To improve learning efficiency and convergence, we further propose a continuous action attention MADDPG (CAA-MADDPG) method, in which each agent learns to pay attention to the actions and observations of the other agents that are more relevant to it. In the simulations, CAA-MADDPG achieves better reward performance than MADDPG without attention.
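To make the objective in the abstract concrete, the sketch below computes a secrecy rate for one transmitter UAV, one jammer UAV, one GU, and one GE. It is only an illustration under stated assumptions, not the paper's system model: the free-space LoS gain `ref_gain / d^2`, the noise power, the `jam_leak` parameter (a crude stand-in for 3D beamforming that aims the jamming at the GE), and all function and variable names are hypothetical. The only standard ingredient is the usual physical-layer-security definition of secrecy rate as the non-negative difference between the legitimate and eavesdropper rates.

```python
import math

def los_gain(tx, rx, ref_gain=1e-3):
    """Assumed free-space LoS power gain: ref_gain divided by squared distance."""
    d2 = sum((a - b) ** 2 for a, b in zip(tx, rx))
    return ref_gain / d2

def secrecy_rate(p_tx, p_jam, uav_tx, uav_jam, user, eve,
                 noise=1e-9, jam_leak=0.0):
    """Secrecy rate in bit/s/Hz for one ground user and one eavesdropper.

    jam_leak is the assumed fraction of jamming power that also reaches the
    legitimate user; 0.0 models an ideal beam aimed only at the eavesdropper.
    """
    # SINR at the legitimate user (jamming only counts through jam_leak).
    sinr_user = p_tx * los_gain(uav_tx, user) / (
        noise + jam_leak * p_jam * los_gain(uav_jam, user))
    # SINR at the eavesdropper, degraded by the full jamming power.
    sinr_eve = p_tx * los_gain(uav_tx, eve) / (
        noise + p_jam * los_gain(uav_jam, eve))
    # Secrecy rate is the rate advantage of the user, floored at zero.
    return max(0.0, math.log2(1 + sinr_user) - math.log2(1 + sinr_eve))

# Hypothetical geometry in metres: the eavesdropper sits closer to the
# transmitter UAV than the legitimate user does.
uav_tx, uav_jam = (0.0, 0.0, 100.0), (10.0, 0.0, 50.0)
user, eve = (50.0, 0.0, 0.0), (10.0, 0.0, 0.0)

without_jammer = secrecy_rate(1.0, 0.0, uav_tx, uav_jam, user, eve)
with_jammer = secrecy_rate(1.0, 1.0, uav_tx, uav_jam, user, eve)
print(without_jammer, with_jammer)
```

In this toy geometry the eavesdropper has the stronger LoS channel, so the secrecy rate without jamming is zero, and turning on the jammer makes it positive. The positions and powers passed to `secrecy_rate` are the kind of quantities the abstract says the agents optimize jointly; the learning algorithm itself is not sketched here.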
