Article

Task offloading in Multiple-Services Mobile Edge Computing: A deep reinforcement learning algorithm

Journal

Computer Communications
Volume 202, Pages 1-12

Publisher

Elsevier
DOI: 10.1016/j.comcom.2023.02.001

Keywords

Mobile edge computing; Service caching; Task offloading; Resource allocation; Deep reinforcement learning


Abstract
Multiple-Services Mobile Edge Computing enables the task-related services cached in an edge server to be dynamically updated, and thus provides great opportunities to offload tasks to the edge server for execution. However, the requirements and popularity of services, the computing requirements of tasks, and the amount of data transferred from users to the edge server all vary over time. How to adaptively adjust the subset of service types cached in the resource-limited edge server, and to determine the task offloading destinations and resource allocation decisions so as to improve overall system performance, is a challenging problem. To solve this challenge, we first formulate it as a Markov decision process, then propose a soft actor-critic deep reinforcement learning-based algorithm, called DSOR, to jointly determine not only the discrete decisions of service caching and task offloading but also the continuous allocation of bandwidth and computing resources. To improve the accuracy of our algorithm, we employ an efficient trick of converting discrete action selection into a continuous space, addressing the key design challenge that arises from the continuous-discrete hybrid action space. Additionally, to improve resource utilization, a novel reward function is integrated into our algorithm to speed up the convergence of training while making full use of system resources. Extensive numerical results show that, compared with other baseline algorithms, our algorithm effectively reduces the long-term average completion delay of tasks while achieving excellent stability.
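The abstract's key design trick is handling a hybrid action space by letting the actor emit only continuous values and decoding the discrete decisions from them. A minimal sketch of one common decoding scheme is below; the function and parameter names (`decode_action`, `n_targets`, `n_users`) and the layout of the output vector are illustrative assumptions, not details taken from the paper.

```python
import math

def decode_action(raw, n_targets, n_users):
    """Split one continuous actor output into a hybrid decision (sketch).

    Assumed layout of `raw` (not from the paper):
    [offload-target scores | bandwidth scores | computing-resource scores]
    """
    # Discrete part: pick the offloading target as the argmax over the
    # first n_targets entries, so the policy network stays fully continuous.
    scores = raw[:n_targets]
    target = scores.index(max(scores))

    # Continuous part: softmax-normalize each group so the bandwidth and
    # computing shares are positive and each group sums to 1.
    def softmax(xs):
        m = max(xs)                      # subtract max for numerical stability
        exps = [math.exp(x - m) for x in xs]
        s = sum(exps)
        return [e / s for e in exps]

    bw = softmax(raw[n_targets:n_targets + n_users])
    cpu = softmax(raw[n_targets + n_users:])
    return target, bw, cpu
```

During training, a relaxation such as Gumbel-softmax is often used instead of a hard argmax so gradients can flow through the discrete choice; the sketch above shows only the inference-time decoding.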

