4.6 Article

Deep Reinforcement Learning With Discrete Normalized Advantage Functions for Resource Management in Network Slicing

Journal

IEEE COMMUNICATIONS LETTERS
Volume 23, Issue 8, Pages 1337-1341

Publisher

IEEE (Institute of Electrical and Electronics Engineers)
DOI: 10.1109/LCOMM.2019.2922961

Keywords

Network slicing; resource management; deep reinforcement learning

Funding

  1. National Natural Science Foundation of China [61701439, 61731002]
  2. Zhejiang Key R&D Plan [2019C01002]
  3. Fundamental Research Funds for the Central Universities

Abstract

Network slicing promises to provision diversified services with distinct requirements on a single infrastructure. Deep reinforcement learning (e.g., deep Q-learning, DQL) is considered an appropriate algorithm for the demand-aware inter-slice resource management problem in network slicing, with the varying demands and the allocated bandwidth regarded as the environment state and the action, respectively. However, allocating bandwidth at a finer resolution usually implies a larger action space, in which case DQL fails to converge quickly. In this letter, we introduce discrete normalized advantage functions (DNAF) into DQL by decomposing the Q-value function into a state-value function term and an advantage term, and by exploiting a deterministic policy gradient descent (DPGD) algorithm to avoid the unnecessary computation of the Q-value for every state-action pair. Furthermore, since DPGD only works in continuous action spaces, we embed a k-nearest-neighbor algorithm into DQL to quickly find the valid discrete action nearest to the DPGD output. Finally, we verify the faster convergence of the DNAF-based DQL through extensive simulations.
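
As a rough illustration of the mechanics described in the abstract (not the authors' implementation), the sketch below combines a NAF-style decomposition Q(s, a) = V(s) + A(s, a), with a quadratic advantage centered on a continuous "proto-action" mu(s), and a k-nearest-neighbor projection that snaps mu(s) onto the closest valid discrete bandwidth allocations. The network heads, slice count, bandwidth grid, and toy state are all illustrative assumptions.

import numpy as np
from itertools import product

rng = np.random.default_rng(0)
N_SLICES = 3          # number of slices sharing the link (assumption)
TOTAL_BW = 10         # total bandwidth units to split among slices (assumption)

# Toy parameter matrices standing in for trained network heads.
w_v = rng.normal(size=N_SLICES)
W_mu = rng.normal(size=(N_SLICES, N_SLICES))

def value(state):
    # State-value term V(s).
    return float(w_v @ state)

def mu(state):
    # Continuous "proto-action" mu(s): a non-negative allocation summing to TOTAL_BW.
    raw = np.exp(W_mu @ state)
    return TOTAL_BW * raw / raw.sum()

def advantage(state, action):
    # Quadratic advantage A(s, a) = -0.5 * ||a - mu(s)||^2; the state-dependent
    # positive-definite matrix P(s) is taken as the identity in this sketch.
    diff = action - mu(state)
    return -0.5 * float(diff @ diff)

def q_value(state, action):
    # NAF decomposition: Q(s, a) = V(s) + A(s, a).
    return value(state) + advantage(state, action)

# Discrete action space: every integer allocation of TOTAL_BW over N_SLICES slices.
discrete_actions = np.array(
    [a for a in product(range(TOTAL_BW + 1), repeat=N_SLICES) if sum(a) == TOTAL_BW],
    dtype=float,
)

def knn_project(state, k=5):
    # Take the k discrete allocations closest to mu(s), then keep the one with the highest Q.
    proto = mu(state)
    dists = np.linalg.norm(discrete_actions - proto, axis=1)
    candidates = discrete_actions[np.argsort(dists)[:k]]
    return max(candidates, key=lambda a: q_value(state, a))

state = rng.random(N_SLICES)  # e.g., normalized per-slice traffic demand (assumption)
print("proto-action mu(s):    ", mu(state))
print("nearest discrete action:", knn_project(state))

In the letter itself, the continuous proto-action would come from the DPGD-trained policy head and P(s) would be a learned positive-definite matrix; both are replaced here by fixed stand-ins purely to keep the sketch self-contained and runnable.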
