Journal
IEEE TRANSACTIONS ON RELIABILITY
Volume 72, Issue 2, Pages 599-608
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TR.2022.3192020
Keywords
Attention mechanism; bandwidth assignment; location deployment; multiagent deep reinforcement learning (DRL); value decomposition network (VDN)
This article investigates the scenario where multiple UAVs serve as edge computing devices for the Internet of Vehicles (IoV). Bandwidth allocation and trajectory control are optimized to maximize the communication capacity of the system, so that the UAV edge computing network can process more data. The proposed actor-critic mixing network (AC-Mix) and multi-attentive agent deep deterministic policy gradient (MA2DDPG) algorithms converge faster than the benchmark algorithm MADDPG.
The rapid development of unmanned aerial vehicles (UAVs) has brought new opportunities for wireless communication and edge computing. In this article, we investigate the scenario where multiple UAVs serve as edge computing devices for the Internet of Vehicles (IoV). Setting aside the allocation of computing resources, we focus on bandwidth allocation and trajectory control to maximize the communication capacity of the system, so that the UAV edge computing network can process more data. To this end, the UAV-assisted IoV edge computing system is modeled as a nonconvex optimization problem whose objective is to maximize the achievable channel capacity of the network. To solve this problem, two quasi-distributed multiagent algorithms based on deep deterministic policy gradient are proposed: actor-critic mixing network (AC-Mix) and multi-attentive agent deep deterministic policy gradient (MA2DDPG). Specifically, AC-Mix uses a mixing network to obtain a global Q-value that better evaluates the joint action, while MA2DDPG employs a multihead attention mechanism to achieve multiagent collaboration. Several experiments are carried out to verify the performance of the proposed algorithms, using multiagent deep deterministic policy gradient (MADDPG) as the benchmark. Simulation results show that the convergence speed of AC-Mix and MA2DDPG is improved by 30.0% and 63.3%, respectively, compared with MADDPG.
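The two ideas named in the abstract, combining per-agent Q-values into a global Q-value and letting agents attend to one another through multiple heads, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the simplest additive (VDN-style) mixing suggested by the keywords, whereas the paper's mixing network may be a learned network, and it omits the learned query/key/value projections of a real multihead attention layer by splitting each feature vector directly into heads. The names `vdn_mix`, `attention_head`, `multihead_attention`, and the toy feature values are all hypothetical.

```python
import math


def vdn_mix(per_agent_q):
    """Additive (VDN-style) mixing: the global Q-value is the sum of per-agent Q-values.

    Assumption: the paper's mixing network may be more elaborate than a plain sum.
    """
    return sum(per_agent_q)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(queries, keys, values):
    """Scaled dot-product attention for one head.

    Each argument is a list of vectors, one vector per agent.
    Returns the attended vectors and the attention weights per agent.
    """
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        out = [sum(wj * v[i] for wj, v in zip(w, values)) for i in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(w)
    return outputs, all_weights


def multihead_attention(features, num_heads):
    """Simplified multihead self-attention across agents.

    Each agent's feature vector is split into num_heads equal chunks, attention
    is computed per chunk, and the per-head outputs are concatenated.
    Simplification: learned projection matrices are omitted (identity projections).
    """
    d_model = len(features[0])
    assert d_model % num_heads == 0, "feature size must be divisible by num_heads"
    d_head = d_model // num_heads
    merged = [[] for _ in features]
    weights_per_head = []
    for h in range(num_heads):
        chunk = [f[h * d_head:(h + 1) * d_head] for f in features]
        out, w = attention_head(chunk, chunk, chunk)
        weights_per_head.append(w)
        for agent_out, o in zip(merged, out):
            agent_out.extend(o)
    return merged, weights_per_head


# Toy data: three agents (UAVs), each with a 4-dimensional feature vector.
features = [[1.0, 0.0, 0.5, 0.5], [0.0, 1.0, 0.5, -0.5], [1.0, 1.0, 0.0, 0.0]]
attended, weights = multihead_attention(features, num_heads=2)
q_total = vdn_mix([1.0, 2.5, -0.5])
```

Here `attended` holds one collaboration-aware feature vector per agent, and `q_total` is the joint value that a centralized critic would use to evaluate the joint action under the additive assumption.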