Article

User Scheduling and Resource Allocation in HetNets With Hybrid Energy Supply: An Actor-Critic Reinforcement Learning Approach

Journal

IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS
Volume 17, Issue 1, Pages 680-692

Publisher

IEEE - Institute of Electrical and Electronics Engineers, Inc.
DOI: 10.1109/TWC.2017.2769644

Keywords

User scheduling; resource allocation; energy efficiency; renewable energy; actor-critic reinforcement learning

Funding

  1. National Natural Science Foundation of China [61571059]
  2. U.S. NSF [CNS-1717454, CNS-1731424, CNS-1702850, CNS-1646607, ECCS-1547201, CMMI-1434789, CNS-1443917, ECCS-1405121]

Abstract

The dense deployment of various small-cell base stations in cellular networks to increase capacity leads to heterogeneous networks (HetNets); meanwhile, embedding energy-harvesting capabilities in base stations as an alternative energy supply is becoming a reality. Efficiently utilizing both radio resources and renewable energy is a new challenge. This paper investigates the optimal policy for user scheduling and resource allocation in HetNets powered by hybrid energy, with the goal of maximizing the energy efficiency of the overall network. Since wireless channel conditions and renewable energy arrival rates are stochastic and the environment's dynamics are unknown, a model-free reinforcement learning approach is used to learn the optimal policy through interactions with the environment. To handle continuous-valued state and action variables, a policy-gradient-based actor-critic algorithm is proposed. The actor uses a Gaussian distribution as the parameterized policy to generate continuous stochastic actions, and the policy parameters are updated by gradient ascent. The critic uses compatible function approximation to estimate the performance of the policy and helps the actor learn the policy gradient. An advantage function is used to further reduce the variance of the policy gradient. Numerical simulations demonstrate the convergence of the proposed algorithm and analyze the network energy efficiency.
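The abstract describes a standard Gaussian-policy actor-critic loop: the actor samples continuous actions from a parameterized Gaussian and ascends the policy gradient, while the critic's value estimate supplies an advantage-like baseline (here approximated by the TD error) to reduce gradient variance. Below is a minimal illustrative sketch of that loop on a toy one-step continuous-action problem; it is not the paper's HetNet formulation, and the reward function, linear features, and learning rates are assumptions chosen for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def phi(s):
    """Linear feature vector for state s (illustrative choice)."""
    return np.array([s, 1.0])

# Actor: Gaussian policy a ~ N(theta . phi(s), sigma^2)
theta = np.zeros(2)   # policy mean parameters
sigma = 0.5           # fixed exploration standard deviation
# Critic: linear value estimate V(s) = w . phi(s)
w = np.zeros(2)

alpha_actor, alpha_critic = 0.005, 0.05

for _ in range(30000):
    s = rng.uniform(-1.0, 1.0)        # random environment state
    f = phi(s)
    mu = theta @ f                    # policy mean for this state
    a = rng.normal(mu, sigma)         # sample continuous stochastic action
    r = -(a - 2.0 * s) ** 2           # toy reward: optimal action is 2*s

    # One-step episode: the TD error reduces to r - V(s); it serves
    # as a low-variance advantage estimate for the actor update.
    delta = r - w @ f
    w += alpha_critic * delta * f     # critic update

    # Gaussian score function: grad_theta log pi = (a - mu)/sigma^2 * phi(s)
    grad_log_pi = (a - mu) / sigma**2 * f
    theta += alpha_actor * delta * grad_log_pi   # policy gradient ascent

# The learned mean action mu(s) should approximate the optimum 2*s,
# i.e. theta should end up close to [2, 0].
print(theta)
```

Subtracting the critic's baseline leaves the expected gradient unchanged (the score-function identity) but shrinks its variance, which is the same role the advantage function plays in the paper's algorithm.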
