4.7 Article

Online Deep Reinforcement Learning for Computation Offloading in Blockchain-Empowered Mobile Edge Computing

Journal

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY
Volume 68, Issue 8, Pages 8050-8062

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TVT.2019.2924015

Keywords

Online computation offloading; blockchain; mobile edge computing; deep reinforcement learning

Funding

  1. National Key Research and Development Plan [2018YFB1003803]
  2. National Natural Science Foundation of China [61802450, 61722214]
  3. Natural Science Foundation of Guangdong [2018A030313005]
  4. Program for Guangdong Introducing Innovative and Entrepreneurial Teams [2017ZT07X355]

Offloading computation-intensive tasks (e.g., blockchain consensus processes and data processing tasks) to the edge or cloud is a promising solution for blockchain-empowered mobile edge computing. However, traditional offloading approaches (e.g., auction-based and game-theoretic approaches) cannot adjust their policy as the environment changes and do not achieve good long-term performance. Moreover, existing deep reinforcement learning-based offloading approaches converge slowly because of the high-dimensional action space. In this paper, we propose a new model-free, deep reinforcement learning-based online computation offloading approach for blockchain-empowered mobile edge computing that considers both mining tasks and data processing tasks. First, we formulate the online offloading problem as a Markov decision process covering both task types. Then, to maximize long-term offloading performance, we leverage deep reinforcement learning to accommodate highly dynamic environments and address the computational complexity. Furthermore, we introduce an adaptive genetic algorithm into the exploration step of deep reinforcement learning to avoid useless exploration and speed up convergence without reducing performance. Finally, our experimental results demonstrate that our algorithm converges quickly and outperforms three benchmark policies.
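To make the genetic-algorithm exploration idea concrete, the following is a minimal Python sketch and not the paper's algorithm. It assumes a binary offloading decision per task (0 = run locally, 1 = offload), a hypothetical `q_value` function standing in for a learned Q-network, and a simple adaptive rule that raises the mutation rate once the population's fitness values have converged. All names, cost values, and parameters are illustrative assumptions.

```python
import random


def q_value(state, action):
    """Hypothetical stand-in for a learned Q-network (assumption).

    state:  list of (local_cost, offload_cost) pairs, one per task.
    action: list of 0/1 flags, 1 meaning the task is offloaded.
    Returns the negative total cost, so higher is better.
    """
    return -sum(off if a else loc for (loc, off), a in zip(state, action))


def ga_explore(state, q_fn, pop_size=20, generations=30, seed=0):
    """Search the binary action space with a small genetic algorithm.

    Instead of sampling exploratory actions uniformly at random, candidates
    are evolved toward actions that the current Q-estimate scores highly.
    Requires at least 2 tasks and pop_size >= 4.
    """
    rng = random.Random(seed)
    n = len(state)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(pop, key=lambda a: q_fn(state, a))

    for _ in range(generations):
        scored = sorted(pop, key=lambda a: q_fn(state, a), reverse=True)
        if q_fn(state, scored[0]) > q_fn(state, best):
            best = scored[0]

        # Adaptive step (illustrative rule): mutate more aggressively when
        # every individual has the same fitness, to restore diversity.
        spread = q_fn(state, scored[0]) - q_fn(state, scored[-1])
        p_mut = 0.3 if spread == 0 else 0.05

        parents = scored[: pop_size // 2]  # truncation selection
        children = [best[:]]               # elitism: keep the best so far
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1)    # single-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children

    final = max(pop, key=lambda a: q_fn(state, a))
    return final if q_fn(state, final) > q_fn(state, best) else best


# Example: 8 tasks with made-up integer costs (local_cost, offload_cost).
tasks = [(5, 2), (1, 4), (3, 3), (7, 1), (2, 6), (4, 4), (9, 3), (1, 8)]
action = ga_explore(tasks, q_value)
```

In a full agent, `ga_explore` would replace the random branch of an epsilon-greedy policy, and the chosen action together with its observed reward would be stored for training the Q-network. The sketch only shows the exploration step.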
