Article

User Scheduling and Resource Allocation in HetNets With Hybrid Energy Supply: An Actor-Critic Reinforcement Learning Approach

Journal

IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS
Volume 17, Issue 1, Pages 680-692

Publisher

IEEE - Institute of Electrical and Electronics Engineers, Inc.
DOI: 10.1109/TWC.2017.2769644

Keywords

User scheduling; resource allocation; energy efficiency; renewable energy; actor-critic reinforcement learning

Funding

  1. National Natural Science Foundation of China [61571059]
  2. U.S. NSF [CNS-1717454, CNS-1731424, CNS-1702850, CNS-1646607, ECCS-1547201, CMMI-1434789, CNS-1443917, ECCS-1405121]

Abstract

The dense deployment of various small-cell base stations in cellular networks to increase capacity leads to heterogeneous networks (HetNets); meanwhile, embedding energy-harvesting capabilities in base stations as an alternative energy supply is becoming a reality. Efficiently utilizing both radio resources and renewable energy is a new challenge. This paper investigates the optimal policy for user scheduling and resource allocation in HetNets powered by hybrid energy, with the goal of maximizing the energy efficiency of the overall network. Since wireless channel conditions and renewable energy arrival rates are stochastic and the environment's dynamics are unknown, a model-free reinforcement learning approach is used to learn the optimal policy through interactions with the environment. To handle continuous-valued state and action variables, a policy-gradient-based actor-critic algorithm is proposed. The actor uses a Gaussian distribution as the parameterized policy to generate continuous stochastic actions, and the policy parameters are updated by gradient ascent. The critic uses compatible function approximation to estimate the performance of the policy and helps the actor learn the policy gradient. An advantage function is used to further reduce the variance of the policy gradient. Numerical simulations demonstrate the convergence of the proposed algorithm and analyze the network energy efficiency.
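The abstract describes a standard Gaussian-policy actor-critic loop: the actor samples continuous actions from a parameterized Gaussian and ascends the policy gradient, while the critic's value estimate supplies an advantage-like baseline (here approximated by the TD error) to reduce gradient variance. Below is a minimal illustrative sketch of that loop on a toy one-step continuous-action problem; it is not the paper's HetNet formulation, and the reward function, linear features, and learning rates are assumptions chosen for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def phi(s):
    """Linear feature vector for state s (illustrative choice)."""
    return np.array([s, 1.0])

# Actor: Gaussian policy a ~ N(theta . phi(s), sigma^2)
theta = np.zeros(2)   # policy mean parameters
sigma = 0.5           # fixed exploration standard deviation
# Critic: linear value estimate V(s) = w . phi(s)
w = np.zeros(2)

alpha_actor, alpha_critic = 0.005, 0.05

for _ in range(30000):
    s = rng.uniform(-1.0, 1.0)        # random environment state
    f = phi(s)
    mu = theta @ f                    # policy mean for this state
    a = rng.normal(mu, sigma)         # sample continuous stochastic action
    r = -(a - 2.0 * s) ** 2           # toy reward: optimal action is 2*s

    # One-step episode: the TD error reduces to r - V(s); it serves
    # as a low-variance advantage estimate for the actor update.
    delta = r - w @ f
    w += alpha_critic * delta * f     # critic update

    # Gaussian score function: grad_theta log pi = (a - mu)/sigma^2 * phi(s)
    grad_log_pi = (a - mu) / sigma**2 * f
    theta += alpha_actor * delta * grad_log_pi   # policy gradient ascent

# The learned mean action mu(s) should approximate the optimum 2*s,
# i.e. theta should end up close to [2, 0].
print(theta)
```

Subtracting the critic's baseline leaves the expected gradient unchanged (the score-function identity) but shrinks its variance, which is the same role the advantage function plays in the paper's algorithm.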
