4.7 Article

Reinforcement learning-based cost-efficient service function chaining with CoMP zero-forcing beamforming in edge networks

Publisher

ELSEVIER
DOI: 10.1016/j.future.2022.11.022

Keywords

Service function chaining; Mobile edge computing; Actor-critic; Zero-forcing beamforming; Proximal center dual decomposition

This paper investigates the online service function chaining (SFC) deployment in the edge of 6G wireless systems using artificial intelligence and reinforcement learning. It proposes a coordinated multiple points (CoMP)-based zero-forcing beamforming method to cancel interference between SFCs. It also introduces a natural gradient-based actor-critic framework to model edge network dynamics and train neural networks to the global optimum.
As two promising paradigms in emerging 6G wireless systems, service function chaining (SFC) and mobile edge computing (MEC) have attracted intensive attention from both industry and academia. Together they can bring more close-proximity services to 6G users with communication, computing and caching (3C) resources, yet they also face challenges arising from time-varying channel conditions and resource dynamics. In this work, motivated by recent advances in artificial intelligence and reinforcement learning, we investigate online SFC deployment at the edge of 6G wireless systems via the actor-critic learning framework. First, a long-run cost-efficient SFC deployment problem is considered, and coordinated multiple points (CoMP)-based zero-forcing beamforming is used to cancel the interference across SFCs. Then, by exploiting the Markov decision process (MDP) property of long-run SFC deployment, a natural gradient-based actor-critic framework is proposed to characterize edge network dynamics while facilitating the training of neural networks toward the global optimum. Next, to reduce the size of the action space, a subproblem is embedded into each state-action pair's critic to solve the reward function, and both the ℓp (0 < p < 1) norm-based successive convex approximation (SCA) and proximal center-based dual decomposition are used to approach the global optimum and accelerate convergence. Finally, numerical results validate the proposed actor-critic approach and show that communication resource management deserves special attention in SFC deployment at the edge of 6G wireless systems. © 2022 Elsevier B.V. All rights reserved.
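
The interference-cancelling step mentioned in the abstract can be illustrated with a generic zero-forcing precoder. The sketch below is not the paper's formulation. It assumes that the antennas of all cooperating points are stacked into one K x N channel matrix H (K users, N >= K antennas), that each beam has unit power, and that the channel is a randomly drawn Gaussian one. The function names (`zero_forcing_precoder`, `inverse`, `matmul`, `conj_transpose`) are hypothetical, and the code uses plain Python so it runs without extra packages; a numerical library would normally be used instead.

```python
import random

def conj_transpose(A):
    """Return the conjugate (Hermitian) transpose of a matrix given as nested lists."""
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]

def matmul(A, B):
    """Multiply two matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse(A):
    """Invert a square complex matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("H H^H is singular; zero-forcing is not possible")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]

def zero_forcing_precoder(H):
    """Return the N x K precoder W = H^H (H H^H)^{-1} with unit-norm columns.

    Assumption: unit power per beam; the paper's power allocation is not shown here.
    """
    Hh = conj_transpose(H)
    W = matmul(Hh, inverse(matmul(H, Hh)))
    for k in range(len(H)):
        norm = sum(abs(W[n][k]) ** 2 for n in range(len(W))) ** 0.5
        for n in range(len(W)):
            W[n][k] /= norm
    return W

# Illustrative channel: 3 users served jointly by 6 stacked antennas (assumed sizes).
random.seed(0)
K, N = 3, 6
H = [[complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N)]
     for _ in range(K)]

W = zero_forcing_precoder(H)
effective = matmul(H, W)  # K x K effective channel seen by the users

# Largest off-diagonal magnitude, i.e. residual inter-user interference.
leakage = max(abs(effective[i][j]) for i in range(K) for j in range(K) if i != j)
print("max inter-user leakage:", leakage)
```

Because W is a scaled right pseudo-inverse of H, the effective channel H W is diagonal up to floating-point error: each user receives only its own stream. Stacking the antennas of several cooperating points is what makes N large enough for the pseudo-inverse to exist when a single point has too few antennas.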
