Article

Federated Multi-Agent Deep Reinforcement Learning for Resource Allocation of Vehicle-to-Vehicle Communications

Journal

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY
Volume 71, Issue 8, Pages 8810-8824

Publisher

IEEE (Institute of Electrical and Electronics Engineers)
DOI: 10.1109/TVT.2022.3173057

Keywords

Resource management; Delays; Reliability; Reinforcement learning; Interference; Transmitters; Training; V2V communication; resource allocation; deep reinforcement learning; dueling double deep Q-network (D3QN); federated learning

Funding

  1. National Natural Science Foundation of China [61771002]
  2. Fundamental Research Funds for the Central Universities [2021CZ102]


This paper investigates a novel approach for resource allocation in vehicle-to-vehicle (V2V) communications that combines multi-agent deep reinforcement learning with federated learning. The proposed method jointly optimizes channel selection and power control, satisfying the reliability and delay requirements of V2V communication while maximizing the transmit rates of cellular links.
Dynamic topology, fast-changing channels, and the time sensitivity of safety-related services present challenges to the status quo of resource allocation for cellular-underlaying vehicle-to-vehicle (V2V) communications. In this paper, we investigate a novel federated multi-agent deep reinforcement learning (FedMARL) approach for the decentralized joint optimization of channel selection and power control for V2V communication. The approach takes advantage of both deep reinforcement learning (DRL) and federated learning (FL), satisfying the reliability and delay requirements of V2V communication while maximizing the transmit rates of cellular links. Specifically, we construct individual V2V agents implemented by the dueling double deep Q-network (D3QN), and design the reward function to train V2V agents collaboratively. As a result, each agent individually optimizes channel selection and power level based on its local observations, including the instantaneous channel state information (CSI) of the corresponding V2V link, the instantaneous co-channel interference from the cellular link, the previous channel selections of nearby V2V pairs, and the queue backlog at the V2V transmitter. Another important aspect is that we incorporate FL to alleviate the training instability induced by the cooperative multi-agent environment. The local DRL models of different V2V agents are federated periodically, addressing each agent's partial observability of the entire network status and accelerating the training process of multi-agent learning. Validated via simulations, the proposed FedMARL scheme outperforms the baselines in terms of both the cellular sum-rate and the V2V packet delivery rate.
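The two building blocks named in the abstract, the dueling double Q-network and periodic federated averaging of the agents' local models, can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the single-hidden-layer network, the observation and action dimensions, and the averaging schedule are all made-up assumptions for the sketch.

```python
import numpy as np

def dueling_q_values(state, params):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    h = np.tanh(state @ params["W_h"])   # shared hidden representation
    v = h @ params["W_v"]                # state-value stream V(s), shape (1,)
    a = h @ params["W_a"]                # advantage stream A(s, a), one entry per action
    return v + a - a.mean()              # combine the two streams

def double_q_target(next_state, online, target, reward, gamma=0.99):
    """Double DQN target: online net selects the action, target net evaluates it."""
    best = int(np.argmax(dueling_q_values(next_state, online)))
    return reward + gamma * dueling_q_values(next_state, target)[best]

def federated_average(local_params_list):
    """FedAvg-style step: element-wise mean of all agents' model parameters."""
    keys = local_params_list[0].keys()
    return {k: np.mean([p[k] for p in local_params_list], axis=0) for k in keys}

# Toy usage: 3 V2V agents, 8-dim local observation, 4 joint channel/power actions
# (dimensions chosen arbitrarily for the sketch).
rng = np.random.default_rng(0)
def init_params():
    return {"W_h": rng.normal(size=(8, 16)),
            "W_v": rng.normal(size=(16, 1)),
            "W_a": rng.normal(size=(16, 4))}

agents = [init_params() for _ in range(3)]
global_model = federated_average(agents)   # periodic federation of local models
obs = rng.normal(size=8)                   # one agent's local observation
q = dueling_q_values(obs, global_model)
action = int(np.argmax(q))                 # greedy channel/power selection
```

In the paper's setting, each agent would train its own D3QN on local observations between federation rounds; the periodic averaging is what shares information across agents despite each one seeing only part of the network state.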

