Article

Task offloading in Multiple-Services Mobile Edge Computing: A deep reinforcement learning algorithm

Journal

COMPUTER COMMUNICATIONS
Volume 202, Issue -, Pages 1-12

Publisher

ELSEVIER
DOI: 10.1016/j.comcom.2023.02.001

Keywords

Mobile edge computing; Service caching; Task offloading; Resource allocation; Deep reinforcement learning


Multiple-Services Mobile Edge Computing enables the task-related services cached in an edge server to be dynamically updated, and thus provides great opportunities to offload tasks to the edge server for execution. However, the requirements and popularity of services, the computing requirements, and the amount of data transferred from users to the edge server all vary over time. How to adaptively adjust the subset of service types cached in the resource-limited edge server, and to determine the task offloading destinations and resource allocation decisions so as to improve overall system performance, is a challenging problem. To solve this challenge, we first convert it into a Markov decision process and then propose a soft actor-critic deep reinforcement learning-based algorithm, called DSOR, to jointly determine not only the discrete decisions of service caching and task offloading but also the continuous allocation of bandwidth and computing resources. To improve the accuracy of our algorithm, we employ an efficient trick of converting the discrete action selection into a continuous space, which addresses the key design challenge arising from the continuous-discrete hybrid action space. Additionally, to improve resource utilization, a novel reward function is integrated into our algorithm to speed up the convergence of training while making full use of system resources. Extensive numerical results show that, compared with other baseline algorithms, our algorithm effectively reduces the long-term average completion delay of tasks while achieving excellent stability.
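The hybrid-action trick described in the abstract can be sketched roughly as follows. This is an illustrative assumption of one common way to embed a discrete choice in a continuous action space (argmax over score components, sigmoid-squashed resource shares); the function name, the squashing scheme, and the feasibility normalization are hypothetical and are not taken from the paper's actual DSOR implementation:

```python
import math

def split_hybrid_action(raw_action, num_destinations):
    """Map a purely continuous actor output onto a hybrid action.

    raw_action       -- flat list of real-valued actor outputs
    num_destinations -- number of candidate offloading destinations

    The first num_destinations entries act as scores for the discrete
    offloading decision (argmax selects one destination); the remaining
    entries are squashed into (0, 1) with a sigmoid and, if needed,
    renormalized so the bandwidth/CPU shares stay within the total
    available resource budget.
    """
    scores = raw_action[:num_destinations]
    destination = max(range(num_destinations), key=lambda i: scores[i])
    shares = [1.0 / (1.0 + math.exp(-x)) for x in raw_action[num_destinations:]]
    total = sum(shares)
    if total > 1.0:  # keep the joint allocation feasible
        shares = [s / total for s in shares]
    return destination, shares
```

With this mapping, the actor network only ever emits a continuous vector, so a continuous-action method such as soft actor-critic can be applied unchanged; the environment decodes the discrete part before executing the step.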

