4.7 Article

Reinforcement Learning-Based Optimal Computing and Caching in Mobile Edge Network

Journal

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/JSAC.2020.3000396

Keywords

Base stations; Optimization; Reinforcement learning; Bandwidth; Correlation; Markov processes; Throughput; Joint pushing and caching; deep reinforcement learning; mobile edge network

Funding

  1. National Natural Science Foundation of China (NSFC) [61831018, 61771345, 61901199, 61631017]
  2. Guangdong Province Key Research and Development Program Major Science and Technology Projects [2018B010115002]

Joint pushing and caching is commonly considered an effective way to adapt to tidal effects in networks. However, the problem of how to precisely predict users' future requests and push or cache the proper content remains to be solved. In this paper, we investigate a joint pushing and caching policy in a general mobile edge computing (MEC) network with multiple users and multicast data. We formulate the joint pushing and caching problem as an infinite-horizon average-cost Markov decision process (MDP). Our aim is not only to maximize bandwidth utilization but also to decrease the total quantity of data transmitted. We then propose a joint pushing and caching policy based on hierarchical reinforcement learning (HRL), which considers both long-term file popularity and short-term temporal correlations of user requests to fully utilize bandwidth. To address the curse of dimensionality, we apply a divide-and-conquer strategy to decompose the joint base station and user cache optimization problem into two subproblems: the user cache optimization subproblem and the base station cache optimization subproblem. We apply Q-learning with value function approximation and a deep Q-network (DQN) to solve these two subproblems. Furthermore, we provide some insights into the design of deep reinforcement learning in network caching. The simulation results show that the proposed policy can learn content popularity and predict users' future demands precisely. Our approach outperforms existing schemes across a range of parameter settings, including the base station cache size, the number of users, and the total number of files, in multiple scenarios.
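
To make the idea of Q-learning with value function approximation for pushing and caching more concrete, the following is a minimal sketch on a toy single-user problem. It is not the authors' model or algorithm. The three-file catalogue, the single-slot user cache, the request model in `next_request`, the cost values, the feature layout, and the use of a discounted criterion (instead of the average-cost criterion named in the abstract) are all assumptions chosen for illustration. The state is the pair (cached file, current request), the action is the file to hold in the cache for the next slot, and the cost combines a miss cost with a cost for proactively pushing a file that is not already being transmitted.

```python
import random

N_FILES = 3        # toy catalogue size (assumption)
MISS_COST = 1.0    # cost of transmitting a requested file on a cache miss (assumption)
PUSH_COST = 0.3    # cost of pushing a file that is not already being sent (assumption)
GAMMA = 0.9        # discount factor; a simplification of the average-cost criterion
N_FEATURES = N_FILES * N_FILES + 2


def next_request(request, rng):
    """Hypothetical request model: file f is usually followed by file f + 1."""
    if rng.random() < 0.8:
        return (request + 1) % N_FILES
    return rng.randrange(N_FILES)


def step_cost(cached, request, action):
    """Miss cost for the current slot plus the cost of pushing a new file."""
    miss = MISS_COST if request != cached else 0.0
    push = PUSH_COST if action not in (cached, request) else 0.0
    return miss + push


def features(cached, request, action):
    """One-hot (request, action) block plus miss and push indicators."""
    phi = [0.0] * N_FEATURES
    phi[request * N_FILES + action] = 1.0
    phi[-2] = 1.0 if request != cached else 0.0
    phi[-1] = 1.0 if action not in (cached, request) else 0.0
    return phi


def q_value(w, cached, request, action):
    """Linear approximation Q(s, a) = w . phi(s, a)."""
    return sum(wi * xi for wi, xi in zip(w, features(cached, request, action)))


def greedy_action(w, cached, request):
    """Costs are minimized, so the greedy action has the smallest Q-value."""
    return min(range(N_FILES), key=lambda a: q_value(w, cached, request, a))


def train(steps=30000, alpha=0.05, epsilon=0.1, seed=0):
    """Semi-gradient Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    w = [0.0] * N_FEATURES
    cached, request = 0, 0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(N_FILES)
        else:
            action = greedy_action(w, cached, request)
        cost = step_cost(cached, request, action)
        nxt_cached, nxt_request = action, next_request(request, rng)
        best_next = min(q_value(w, nxt_cached, nxt_request, a) for a in range(N_FILES))
        td_error = cost + GAMMA * best_next - q_value(w, cached, request, action)
        phi = features(cached, request, action)
        w = [wi + alpha * td_error * xi for wi, xi in zip(w, phi)]
        cached, request = nxt_cached, nxt_request
    return w


def average_cost(policy, steps=20000, seed=1):
    """Empirical per-slot cost of a policy mapping (cached, request) to an action."""
    rng = random.Random(seed)
    cached, request, total = 0, 0, 0.0
    for _ in range(steps):
        action = policy(cached, request)
        total += step_cost(cached, request, action)
        cached, request = action, next_request(request, rng)
    return total / steps


if __name__ == "__main__":
    w = train()
    learned = average_cost(lambda c, r: greedy_action(w, c, r))
    reactive = average_cost(lambda c, r: r)  # baseline: keep the latest request
    print(f"learned policy average cost:    {learned:.3f}")
    print(f"reactive baseline average cost: {reactive:.3f}")
```

In this toy setting the reactive baseline simply keeps the most recently requested file, while the learned policy can exploit the short-term correlation between consecutive requests by pushing the likely next file ahead of time. A deep Q-network would replace the hand-built `features` function and the linear weights with a neural network, which is the kind of approximator that becomes necessary when the state space grows with the number of users and files.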
