Article

Deep-Reinforcement-Learning-Based Offloading Scheduling for Vehicular Edge Computing

Journal

IEEE Internet of Things Journal
Volume 7, Issue 6, Pages 5449-5465

Publisher

IEEE (Institute of Electrical and Electronics Engineers)
DOI: 10.1109/JIOT.2020.2978830

Keywords

Task analysis; Servers; Processor scheduling; Schedules; Internet of Things; Edge computing; Wireless communication; Computation offloading; deep reinforcement learning (DRL); mobile-edge computing (MEC); task scheduling; vehicular edge computing (VEC)

Funding

  1. National Natural Science Foundation of China [61871096, 61972075]
  2. National Key Research and Development Program of China [2018YFB2101300]
  3. EU H2020 Research and Innovation Programme under the Marie Sklodowska-Curie grant agreement [752979]
  4. China Scholarship Council

Abstract

Vehicular edge computing (VEC) is a new computing paradigm with great potential to enhance the capability of vehicle terminals (VTs) to support resource-hungry in-vehicle applications with low latency and high energy efficiency. In this article, we investigate an important computation offloading scheduling problem in a typical VEC scenario, where a VT traveling along an expressway must schedule the tasks waiting in its queue so as to minimize the long-term cost, defined as a tradeoff between task latency and energy consumption. Owing to diverse task characteristics, a dynamic wireless environment, and frequent handover events caused by vehicle movement, an optimal solution must consider both where to schedule each task (i.e., local computation or offloading) and when to schedule it (i.e., the order and time of execution). To solve this complicated stochastic optimization problem, we model it as a carefully designed Markov decision process (MDP) and resort to deep reinforcement learning (DRL) to cope with the enormous state space. Our DRL implementation is built on the state-of-the-art proximal policy optimization (PPO) algorithm. A parameter-shared network architecture combined with a convolutional neural network (CNN) is used to approximate both the policy and the value function, effectively extracting representative features from the state. A series of adjustments to the state and reward representations further improves training efficiency. Extensive simulation experiments and comprehensive comparisons with six known baseline algorithms and their heuristic combinations clearly demonstrate the advantages of the proposed DRL-based offloading scheduling method.
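
The abstract describes a parameter-shared actor-critic architecture, in which a CNN feature extractor feeds both a policy head and a value head, trained with PPO's clipped surrogate objective. Below is a minimal illustrative sketch in PyTorch of what such a design could look like; the layer sizes, state/action dimensions, and all identifiers are assumptions made here for illustration, not the authors' actual implementation.

# Illustrative sketch only: a parameter-shared actor-critic network with a
# CNN feature extractor, as described in the abstract. All dimensions,
# layer sizes, and names below are assumptions, not the paper's design.
import torch
import torch.nn as nn

class SharedActorCritic(nn.Module):
    def __init__(self, in_channels: int = 1, num_actions: int = 8):
        super().__init__()
        # Shared CNN trunk: extracts features from a 2-D state
        # representation (e.g., a matrix of per-task attributes in the queue).
        self.trunk = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
            nn.Flatten(),
        )
        feat_dim = 32 * 4 * 4
        # Policy head: logits over scheduling actions (which queued task to
        # run next, and whether to execute it locally or offload it).
        self.policy_head = nn.Linear(feat_dim, num_actions)
        # Value head: scalar state-value estimate, reusing the same trunk.
        self.value_head = nn.Linear(feat_dim, 1)

    def forward(self, state: torch.Tensor):
        feats = self.trunk(state)
        return self.policy_head(feats), self.value_head(feats).squeeze(-1)

def ppo_clipped_loss(new_logp, old_logp, advantages, clip_eps: float = 0.2):
    """Standard PPO clipped surrogate objective (returned as a loss to minimize)."""
    ratio = torch.exp(new_logp - old_logp)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()

# Example usage (hypothetical shapes): a batch of 8 states, each a 1x10x6
# matrix of per-task features.
net = SharedActorCritic(in_channels=1, num_actions=8)
logits, value = net(torch.randn(8, 1, 10, 6))
action = torch.distributions.Categorical(logits=logits).sample()

The shared trunk reflects the design choice named in the abstract: because the policy and value heads reuse the same convolutional features, representation learning is amortized across both objectives rather than duplicated in two separate networks.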
