☆ 4.7 Article

Multi-Agent Reinforcement Learning Based 3D Trajectory Design in Aerial-Terrestrial Wireless Caching Networks

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY (2021)

期刊

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY

卷 70, 期 8, 页码 8201-8215

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TVT.2021.3094273

关键词

Trajectory; Device-to-device communication; Wireless communication; Cache storage; Wireless sensor networks; Unmanned aerial vehicles; Throughput; Unmanned aerial vehicles (UAVs); trajectory design; wireless caching; multi-agent reinforcement learning

类别

Engineering, Electrical & Electronic Telecommunications Transportation Science & Technology

资金

Ministry of Science and Technology, Taiwan [MOST 108-2218-E-008 -016 -MY2]

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

智能总结 New
摘要

This paper investigates a dynamic 3D trajectory design of multiple cache-enabled UAVs in a wireless D2D caching network with the goal of maximizing long-term network throughput. The proposed multi-agent reinforcement learning framework outperforms traditional single- and multi-agent Q-learning algorithms in determining optimal UAV trajectories. Our work confirms the feasibility and effectiveness of cache-enabled UAVs as an important complement to terrestrial D2D caching nodes.

This paper investigates a dynamic 3D trajectory design of multiple cache-enabled unmanned aerial vehicles (UAVs) in a wireless device-to-device (D2D) caching network with the goal of maximizing the long-term network throughput. By storing popular content at the nearby mobile user devices, D2D caching is an efficient method to improve network throughput and alleviate backhaul burden. With the attractive features of high mobility and flexible deployment, UAVs have recently attracted significant attention as cache-enabled flying base stations. The use of cache-enabled UAVs opens up the possibility of tracking the mobility pattern of the corresponding users and serving them under limited cache storage capacity. However, it is challenging to determine the optimal UAV trajectory due to the dynamic environment with frequently changing network topology and the coexistence of aerial and terrestrial caching nodes. In response, we propose a novel multi-agent reinforcement learning based framework to determine the optimal 3D trajectory of each UAV in a distributed manner without a central coordinator. In the proposed method, multiple UAVs can cooperatively make flight decisions by sharing the gained experiences within a certain proximity to each other. Simulation results reveal that our algorithm outperforms the traditional single- and multi-agent Q-learning algorithms. This work confirms the feasibility and effectiveness of cache-enabled UAVs which serve as an important complement to terrestrial D2D caching nodes.

Multi-Agent Reinforcement Learning Based 3D Trajectory Design in Aerial-Terrestrial Wireless Caching Networks

期刊

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

Multi-Agent Reinforcement Learning Based 3D Trajectory Design in Aerial-Terrestrial Wireless Caching Networks

期刊

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文