Article

Bandwidth Allocation and Trajectory Control in UAV-Assisted IoV Edge Computing Using Multiagent Reinforcement Learning

IEEE TRANSACTIONS ON RELIABILITY (2023)

Journal

IEEE TRANSACTIONS ON RELIABILITY

Volume 72, Issue 2, Pages 599-608

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TR.2022.3192020

Keywords

Attention mechanism; bandwidth assignment; location deployment; multiagent deep reinforcement learning (DRL); value decomposition network (VDN)

Categories

Computer Science, Hardware & Architecture; Computer Science, Software Engineering; Engineering, Electrical & Electronic

Abstract

This article investigates a scenario in which multiple UAVs serve as edge computing devices for the Internet of Vehicles (IoV). Bandwidth allocation and trajectory control are optimized to maximize the communication capacity of the system, so that the UAV edge computing network can process more data. The two proposed algorithms, actor-critic mixing network (AC-Mix) and multi-attentive agent deep deterministic policy gradient (MA2DDPG), improve performance over the benchmark algorithm MADDPG.

The rapid development of unmanned aerial vehicles (UAVs) has brought new opportunities for wireless communication and edge computing. In this article, we investigate the scenario where multiple UAVs serve as edge computing devices for the Internet of Vehicles (IoV). Setting aside the allocation of computing resources, we focus on bandwidth allocation and trajectory control to maximize the communication capacity of the system, so that the UAV edge computing network can process more data. To this end, a UAV-assisted IoV edge computing system model is constructed and formulated as a nonconvex optimization problem whose objective is to maximize the achievable channel capacity of the network. To solve this problem, two quasi-distributed multiagent algorithms based on deep deterministic policy gradient are proposed: actor-critic mixing network (AC-Mix) and multi-attentive agent deep deterministic policy gradient (MA2DDPG). Specifically, AC-Mix uses a mixing network to obtain a global Q-value for better evaluation of the joint action, while MA2DDPG employs a multihead attention mechanism to achieve multiagent collaboration. Using multi-agent deep deterministic policy gradient (MADDPG) as the benchmark, several experiments are carried out to verify the performance of the proposed algorithms. Simulation results show that the convergence speed of AC-Mix and MA2DDPG is improved by 30.0% and 63.3%, respectively, compared with MADDPG.
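To make the optimization objective concrete, the following is a minimal sketch of how achievable channel capacity might be scored for one UAV position and one bandwidth split. It is not the paper's system model: the line-of-sight gain formula, the constants `ref_gain` and `noise_psd`, and the function names `channel_capacity` and `total_capacity` are all assumptions made for illustration. Only the Shannon capacity formula itself is standard.

```python
import math


def channel_capacity(bandwidth_hz, tx_power_w, distance_m,
                     ref_gain=1e-4, noise_psd=1e-17):
    """Shannon capacity (bit/s) of a single UAV-to-vehicle link.

    Assumption: a hypothetical line-of-sight channel whose power gain
    decays as ref_gain / distance^2. The paper's channel model may differ.
    """
    if bandwidth_hz <= 0:
        return 0.0
    gain = ref_gain / distance_m ** 2
    # Noise power grows with the bandwidth assigned to the link.
    snr = tx_power_w * gain / (noise_psd * bandwidth_hz)
    return bandwidth_hz * math.log2(1.0 + snr)


def total_capacity(uav_pos, vehicle_positions, shares,
                   total_bandwidth_hz=1e6, tx_power_w=0.1):
    """Sum capacity of one UAV serving several vehicles.

    shares[i] is the fraction of the UAV's bandwidth given to vehicle i
    (hypothetical encoding of the bandwidth-allocation decision).
    """
    if any(s < 0 for s in shares) or sum(shares) > 1.0 + 1e-9:
        raise ValueError("shares must be non-negative and sum to at most 1")
    total = 0.0
    for pos, share in zip(vehicle_positions, shares):
        distance = math.dist(uav_pos, pos)  # 3-D Euclidean distance
        total += channel_capacity(share * total_bandwidth_hz,
                                  tx_power_w, distance)
    return total


# Usage: compare two UAV positions under the same even bandwidth split.
vehicles = [(50.0, 0.0, 0.0), (300.0, 0.0, 0.0)]
at_origin = total_capacity((0.0, 0.0, 100.0), vehicles, [0.5, 0.5])
at_midpoint = total_capacity((175.0, 0.0, 100.0), vehicles, [0.5, 0.5])
print(at_origin, at_midpoint)
```

Under these assumptions the score depends jointly on where the UAV hovers and how its bandwidth is divided, which is why the two decisions are optimized together. A reinforcement learning agent could use such a score as a reward signal, although the reward design actually used in the paper is not stated in the abstract.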
