Article

Twin delayed deep deterministic policy gradient-based intelligent computation offloading for IoT

Journal

DIGITAL COMMUNICATIONS AND NETWORKS
Volume 9, Issue 4, Pages 836-845

Publisher

KEAI PUBLISHING LTD
DOI: 10.1016/j.dcan.2022.06.008

Keywords

Fog computing; Computation offloading; Deep reinforcement learning; Resource allocation

This paper studies an efficient and intelligent computation offloading mechanism with resource allocation for randomly distributed users in dynamic, large-scale IoT scenarios. An optimization problem is formulated to minimize the total energy consumption of all tasks, and a TD3PG-ICO algorithm is proposed to solve it. Simulation results show that the proposed algorithm converges faster, is robust, and reduces total energy consumption compared with other schemes.
Because multiple users are randomly distributed in dynamic, large-scale Internet of Things (IoT) scenarios, coordinating the available resources of the fog nodes in an area and providing computation services at low cost have become major challenges. This paper therefore studies an efficient and intelligent computation offloading mechanism with resource allocation. Specifically, an optimization problem is formulated to minimize the total energy consumption of all tasks by jointly optimizing computation offloading decisions, bandwidth resources, and transmission power. A Twin Delayed Deep Deterministic Policy Gradient-based Intelligent Computation Offloading (TD3PG-ICO) algorithm is proposed to solve this optimization problem. Building on the actor-critic framework, the proposed algorithm uses two independent critic networks, which avoids relying on the biased value prediction of a single critic network and better guides the policy network toward a globally optimal computation offloading policy. The algorithm also introduces a continuous-variable discretization operation to select the target offloading node with random probability, and the available resources of the target node are allocated dynamically to improve decision quality. Simulation results show that the proposed algorithm converges faster and is robust, and that its total energy consumption consistently approaches that of the greedy algorithm, which attains the lowest total energy consumption. Furthermore, compared with fully local computation and a Deep Q-learning Network (DQN)-based computation offloading scheme, total energy consumption is reduced by an average of 15.53% and 6.41%, respectively.
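The two mechanisms named in the abstract, twin critics and continuous-variable discretization, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The target below is the standard TD3 clipped double-Q target, which takes the smaller of the two target-critic estimates. The mapping from a continuous actor output in [-1, 1] to a node index, the `epsilon` exploration rate used to model "random probability", and all function names are assumptions made for illustration only.

```python
import random


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Standard TD3 clipped double-Q target.

    Taking the minimum of the two target-critic estimates limits the
    overestimation that a single critic tends to produce.
    """
    return reward + gamma * (1.0 - done) * min(q1_next, q2_next)


def discretize_action(a, num_nodes):
    """Map a continuous actor output a in [-1, 1] to a node index.

    Assumption: the interval is split into num_nodes equal-width bins,
    so the result lies in 0 .. num_nodes - 1.
    """
    a = max(-1.0, min(1.0, a))              # clip to the valid range
    idx = int((a + 1.0) / 2.0 * num_nodes)  # bin number, num_nodes at a == 1
    return min(idx, num_nodes - 1)          # fold the right edge into the last bin


def select_node(a, num_nodes, epsilon=0.1, rng=random):
    """Pick the target offloading node.

    Assumption: with probability epsilon a node is chosen uniformly at
    random (exploration); otherwise the discretized actor output is used.
    """
    if rng.random() < epsilon:
        return rng.randrange(num_nodes)
    return discretize_action(a, num_nodes)


if __name__ == "__main__":
    # The two critics disagree (5.0 vs 3.0); the target uses the smaller value.
    print(td3_target(reward=1.0, done=0.0, q1_next=5.0, q2_next=3.0, gamma=0.9))
    # An actor output of 0.0 with four candidate nodes falls into bin 2.
    print(select_node(0.0, num_nodes=4, epsilon=0.0))
```

In a full TD3 agent, both critics would be trained toward this shared target, and the actor would be updated less frequently than the critics, which is the "delayed" part of the name. How the paper allocates bandwidth and transmission power at the selected node is not specified in the abstract and is left out of the sketch.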
