4.8 Article

Multiagent Deep-Reinforcement-Learning-Based Resource Allocation for Heterogeneous QoS Guarantees for Vehicular Networks

Journal

IEEE INTERNET OF THINGS JOURNAL
Volume 9, Issue 3, Pages 1683-1695

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/JIOT.2021.3089823

Keywords

Resource management; Quality of service; Optimization; Reinforcement learning; Training; Entertainment industry; Copper; Deep reinforcement learning (DRL); heterogeneous applications; multi-agent deep deterministic policy gradient (MADDPG); resource allocation

Funding

  1. Project of International Cooperation and Exchanges NSFC [61860206005]
  2. National Natural Science Foundation of China [61801278, 61972237]
  3. Key Laboratory of Cognitive Radio and Information Processing, Ministry of Education (Guilin University of Electronic Technology) [CRKL190205]
  4. Shandong Provincial Scientific Research Programs in Colleges and Universities [J18KA310]

In this article, a multi-agent deep reinforcement learning-based resource allocation framework is proposed to satisfy the heterogeneous QoS requirements in vehicular networks. The framework combines centralized learning and decentralized execution to optimize channel allocation and power control.

Vehicle-to-vehicle communications can offer direct information interaction, including security-centered information and entertainment information. However, the rapid proliferation of vehicles and the diversity of communication services demand a more intelligent and efficient resource allocation framework to enhance network performance. In this article, a multi-agent deep reinforcement learning-based resource allocation framework is developed to jointly optimize channel allocation and power control so as to satisfy the heterogeneous Quality-of-Service (QoS) requirements in heterogeneous vehicular networks. In the proposed framework, the utility maximization problem is formulated by considering two types of traffic: safety-centric applications with strict ultrareliable and low-latency requirements, and entertainment applications with high-capacity requirements. The utility of each vehicular user is formulated as a multicriterion objective function that takes the heterogeneous traffic requirements into account. To overcome the drawbacks of traditional fully centralized and fully distributed deep reinforcement learning-based resource allocation approaches, we propose a multi-agent deep deterministic policy gradient algorithm with centralized learning and decentralized execution to solve the formulated optimization problem. Normalization of the input states and reward functions is introduced to speed up the training of the proposed algorithm. Simulation results show that the proposed algorithm outperforms the compared methods and schemes in both convergence and system performance for delay-sensitive and delay-tolerant applications.
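
The abstract names two mechanisms without giving details: input normalization, and centralized learning with decentralized execution. The following is a minimal sketch of how the two could fit together, not the authors' implementation. Everything in it is an assumption made for illustration: the running-statistics normalizer (Welford's algorithm), the single-layer tanh actors, the linear critic, the dimensions, and the synthetic Gaussian observations. It covers only the forward pass and omits the policy-gradient training step entirely.

```python
import math
import random


class RunningNormalizer:
    """Per-feature running mean/variance (Welford's algorithm). Hypothetical helper."""

    def __init__(self, dim, eps=1e-8):
        self.n = 0
        self.mean = [0.0] * dim
        self.m2 = [0.0] * dim  # running sum of squared deviations
        self.eps = eps

    def update(self, x):
        self.n += 1
        for i, xi in enumerate(x):
            delta = xi - self.mean[i]
            self.mean[i] += delta / self.n
            self.m2[i] += delta * (xi - self.mean[i])

    def normalize(self, x):
        out = []
        for i, xi in enumerate(x):
            var = self.m2[i] / self.n if self.n > 0 else 1.0
            out.append((xi - self.mean[i]) / math.sqrt(var + self.eps))
        return out


def actor(local_obs, weights):
    """Decentralized execution: the action depends only on the agent's own observation."""
    return [math.tanh(sum(w * o for w, o in zip(row, local_obs))) for row in weights]


def centralized_critic(all_obs, all_actions, weights):
    """Centralized learning: the critic scores the joint observations and actions."""
    joint = [v for obs in all_obs for v in obs] + [v for act in all_actions for v in act]
    return sum(w * v for w, v in zip(weights, joint))


# Assumed toy dimensions: 3 agents, 4 observation features, 2 action outputs
# (for example, a channel score and a transmit-power level).
random.seed(0)
N_AGENTS, OBS_DIM, ACT_DIM = 3, 4, 2

normalizers = [RunningNormalizer(OBS_DIM) for _ in range(N_AGENTS)]
actor_w = [
    [[random.uniform(-1, 1) for _ in range(OBS_DIM)] for _ in range(ACT_DIM)]
    for _ in range(N_AGENTS)
]
critic_w = [random.uniform(-1, 1) for _ in range(N_AGENTS * (OBS_DIM + ACT_DIM))]

# Warm up the statistics on synthetic observations with a large offset and scale.
for _ in range(100):
    for norm in normalizers:
        norm.update([random.gauss(50.0, 20.0) for _ in range(OBS_DIM)])

raw_obs = [[random.gauss(50.0, 20.0) for _ in range(OBS_DIM)] for _ in range(N_AGENTS)]
obs = [norm.normalize(o) for norm, o in zip(normalizers, raw_obs)]

# Each agent acts on its own normalized observation; the critic sees everything.
actions = [actor(o, w) for o, w in zip(obs, actor_w)]
q_value = centralized_critic(obs, actions, critic_w)
```

A `RunningNormalizer(1)` could standardize scalar rewards in the same way. Under these assumptions, only `actor` would run on a vehicle at execution time. `centralized_critic` would be needed only during training, where access to all agents' observations and actions is available.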
