Article

Learning-Based Computation Offloading Approaches in UAVs-Assisted Edge Computing

Journal

IEEE Transactions on Vehicular Technology
Volume 70, Issue 1, Pages 928-944

Publisher

IEEE (Institute of Electrical and Electronics Engineers)
DOI: 10.1109/TVT.2020.3048938

Keywords

Task analysis; Resource management; Edge computing; Reinforcement learning; Heuristic algorithms; Vehicle dynamics; Time factors; Bandwidth allocation; computation offloading; inter-dependencies; multi-agent reinforcement learning; UAV

Funding

  1. National Natural Science Foundation of China [61671295]
  2. National Fundamental Research Key Project of China [JCKY2017203B082]


Summary

This paper proposes a UAVs-assisted computation offloading paradigm, modeling the problem of average mission response time minimization as a Markov decision process and applying multi-agent reinforcement learning algorithms to determine the target helper and bandwidth allocation. The proposed MARL-based approaches demonstrate desirable convergence properties and outperform benchmark approaches by significantly reducing average mission response time.

Abstract

Technological evolutions in the unmanned aerial vehicle (UAV) industry have granted UAVs more computing and storage resources, leading to the vision of UAVs-assisted edge computing, in which computing missions can be offloaded from a cellular network to a UAV cloudlet. In this paper, we propose a UAVs-assisted computation offloading paradigm, where a group of UAVs fly around while providing value-added edge computing services. The complex computing missions are decomposed into typical task-flows with inter-dependencies. By taking into consideration the inter-dependencies of the tasks, the dynamic network states, and the energy constraints of the UAVs, we formulate the average mission response time minimization problem and then model it as a Markov decision process. Specifically, each time a mission arrives or a task execution finishes, we must decide the target helper for the next task execution and the fraction of the bandwidth allocated to the communication. To separate the evaluation of this integrated decision, we propose multi-agent reinforcement learning (MARL) algorithms, where the target helper and the bandwidth allocation are determined by two agents. We design respective advantage evaluation functions for the agents to solve the multi-agent credit assignment challenge, and further extend the on-policy algorithm to off-policy. Simulation results show that the proposed MARL-based approaches exhibit desirable convergence properties and can adapt to the dynamic environment. The proposed approaches significantly reduce the average mission response time compared with other benchmark approaches.
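The two-agent structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the paper uses advantage-based policy-gradient MARL with per-agent credit assignment, whereas the toy below uses simple bandit-style Q updates with a shared reward. The helper count, bandwidth levels, and `response_time` model are all hypothetical; the sketch only shows the decomposition of one integrated decision (helper choice + bandwidth fraction) across two agents trained on a common objective of minimizing response time.

```python
import random

random.seed(0)

N_HELPERS = 3                        # hypothetical number of UAV helpers
BW_LEVELS = [0.25, 0.5, 0.75, 1.0]   # hypothetical discrete bandwidth fractions


class Agent:
    """Tabular bandit-style agent over a small discrete action set."""

    def __init__(self, n_actions, lr=0.1, eps=0.2):
        self.q = [0.0] * n_actions   # running value estimate per action
        self.lr, self.eps = lr, eps

    def act(self):
        # Epsilon-greedy exploration over the action set.
        if random.random() < self.eps:
            return random.randrange(len(self.q))
        return max(range(len(self.q)), key=self.q.__getitem__)

    def update(self, a, reward):
        # Move the value estimate toward the observed shared reward.
        self.q[a] += self.lr * (reward - self.q[a])


def response_time(helper, bw_frac):
    """Toy environment: per-helper compute delay plus a transfer delay
    that shrinks as the allocated bandwidth fraction grows."""
    compute = [2.0, 1.0, 3.0][helper]
    transfer = 1.0 / bw_frac
    return compute + transfer


helper_agent = Agent(N_HELPERS)        # decides the target helper
bw_agent = Agent(len(BW_LEVELS))       # decides the bandwidth fraction

for _ in range(2000):
    h = helper_agent.act()
    b = bw_agent.act()
    t = response_time(h, BW_LEVELS[b])
    reward = -t                        # shared team reward: minimize response time
    helper_agent.update(h, reward)
    bw_agent.update(b, reward)

best_h = max(range(N_HELPERS), key=helper_agent.q.__getitem__)
best_b = BW_LEVELS[max(range(len(BW_LEVELS)), key=bw_agent.q.__getitem__)]
print(best_h, best_b)
```

In this toy setup, helper 1 (lowest compute delay) and the full bandwidth fraction jointly minimize response time, so both agents converge to that pair despite each seeing only the shared reward. This shared-reward setup is also what makes credit assignment hard in practice, which is why the paper designs separate advantage evaluation functions for the two agents.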

