4.7 Article

Multi-Agent Deep Reinforcement Learning Based Downlink Beamforming in Heterogeneous Networks

Journal

IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS
Volume 22, Issue 6, Pages 4247-4263

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TWC.2022.3224150

Keywords

Heterogeneous network; beamforming; deep reinforcement learning; multi-agent


This paper proposes a beamforming framework that uses multi-agent deep reinforcement learning (DRL) to maximize the system downlink sum-rate in a heterogeneous network (HetNet). Each access point (AP) acts as an agent: it generates a beamforming vector from local observations and then evaluates how appropriate that vector is. The proposed framework converges quickly and outperforms benchmark beamforming methods in system downlink sum-rate.
We consider a heterogeneous network (HetNet), where multiple access points (APs) of potentially different transmission capacities serve users simultaneously via beamforming in the same spectrum band. We propose a beamforming framework that exploits multi-agent deep reinforcement learning (DRL) to maximize the system downlink sum-rate of the HetNet. In our framework, each AP acts as an agent equipped with an online policy deep neural network (DNN) and an online Q-function DNN. The former generates the AP's beamforming vector in a time slot based only on local observations, while the latter evaluates the appropriateness of that beamforming vector. We present a distributed-updating-centralized-rewarding scheme to train the policy DNNs and Q-function DNNs of all the APs in an online, trial-and-error manner. Under this scheme, all the APs take the system downlink sum-rate of a recent time slot (reported by a central controller) as their identical one-step reward. Trained in every time slot on experience items that carry this centralized reward, the weight vectors of each AP's local DNNs are updated toward the global optimum. Simulation results demonstrate that the proposed framework converges quickly and outperforms the benchmark beamforming methods in system downlink sum-rate.
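
The centralized reward described above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes, for illustration only, that each AP serves one single-antenna user, that the channels are known to the central controller, and that the sum-rate follows the standard Shannon expression sum_k log2(1 + SINR_k). The names `sum_rate`, `shared_rewards`, `H`, `W`, and `noise_power` are hypothetical.

```python
import numpy as np

def sum_rate(H, W, noise_power=1.0):
    """System downlink sum-rate in bit/s/Hz (illustrative model).

    H: complex array of shape (K, K, N); H[k, j] is the assumed channel
       vector from AP j to user k.
    W: complex array of shape (K, N); W[j] is AP j's beamforming vector.
    """
    # gains[k, j] = |h_{k,j}^H w_j|^2, the power user k receives from AP j
    gains = np.abs(np.einsum("kjn,jn->kj", H.conj(), W)) ** 2
    signal = np.diag(gains)                     # power from the serving AP
    interference = gains.sum(axis=1) - signal   # power from all other APs
    sinr = signal / (interference + noise_power)
    return float(np.sum(np.log2(1.0 + sinr)))

def shared_rewards(H, W, noise_power=1.0):
    """Central controller: every agent gets the same one-step reward."""
    reward = sum_rate(H, W, noise_power)
    return [reward] * W.shape[0]
```

In a training loop, each AP would store this common reward together with its own local observation and beamforming action as one experience item, then update its own policy and Q-function DNNs locally.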
