Article

Game Combined Multi-Agent Reinforcement Learning Approach for UAV Assisted Offloading

Journal

IEEE Transactions on Vehicular Technology
Volume 70, Issue 12, Pages 12888-12901

Publisher

IEEE - Institute of Electrical and Electronics Engineers Inc.
DOI: 10.1109/TVT.2021.3121281

Keywords

Games; Time division multiple access; Relays; Task analysis; Collision avoidance; Trajectory optimization; Delays; Unmanned aerial vehicle; offloading; trajectory optimization; potential game; multi-agent deep reinforcement learning; energy efficiency; obstacle avoidance

Funding

  1. China and Shaanxi Postdoctoral Science Foundation [2017M623243, 2018BSHYDZZ26]
  2. Shaanxi and Guangxi Keypoint Research and Invention Program [2019ZDLGY13-02-02, AB19110036]
  3. Taicang Keypoint Science and Technology Plan [TC2018SF03, TC2019SF03]

The research proposes an approach that combines potential games with multi-agent deep deterministic policy gradient (MADDPG) to optimize the trajectories of multiple UAVs, taking into account GUs' data offloading delay, energy efficiency, and obstacle avoidance.
Air-ground integrated mobile cloud computing (MCC) gives unmanned aerial vehicles (UAVs) the capability to act as aerial relays with greater flexibility and resilience. In this cloud computing architecture, the data generated by ground users (GUs) can be offloaded to a remote server for fast processing. However, the heterogeneity of mobile tasks makes the data sizes distributed among GUs unbalanced. Moreover, the energy efficiency of UAV movement must be carefully considered for sustainable flight and obstacle avoidance. In general, such a joint trajectory problem can hardly be formulated as a convex optimization in unpredictable and dynamic environments. This paper proposes a potential game combined multi-agent deep deterministic policy gradient (MADDPG) approach to optimize the trajectories of multiple UAVs with consideration of GUs' offloading delay, energy efficiency, and obstacle avoidance. Specifically, we first model the issue as a mixed integer non-linear problem (MINP), in which the service assignment between multiple users and multiple UAVs is solved by a potential game. Convergence to a Nash equilibrium (NE) is achieved within finite iterations of the distributed service-assignment update. Then, we optimize the trajectory with obstacle avoidance at each UAV using the MADDPG approach, whose centralized-training and decentralized-execution structure reduces the global synchronized communication overhead. UAV movement can be optimized in a continuous action space, in contrast to other deep reinforcement learning (DRL) approaches that generate simple discrete actions. Experiments demonstrate that the proposed game-combined learning algorithm minimizes offloading delay, enhances the UAVs' energy efficiency, and avoids obstacles at the same time.
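
The service-assignment step can be pictured with a small best-response loop. The sketch below is a hypothetical, simplified congestion-style potential game; the cost function, GU data sizes, and constants are illustrative assumptions rather than the paper's actual MINP formulation. Each GU repeatedly switches to the UAV that minimizes its own offloading cost, and because every strict improvement decreases a potential function over a finite strategy space, the loop settles at a pure Nash equilibrium.

```python
import random

# A minimal sketch, assuming a simplified congestion-style cost: each ground
# user (GU) offloads its data to exactly one UAV, and the delay it experiences
# grows with the total data assigned to that UAV.  All constants and the cost
# function are illustrative assumptions, not the paper's model.

NUM_GUS, NUM_UAVS = 12, 3
random.seed(0)
data_size = [random.uniform(1.0, 5.0) for _ in range(NUM_GUS)]  # Mbits per GU (assumed)

def gu_cost(gu, uav, assignment):
    """Cost seen by `gu` if it offloads to `uav`: its own data plus the data
    of the other GUs currently assigned to the same UAV."""
    others = sum(data_size[g] for g, u in enumerate(assignment) if u == uav and g != gu)
    return data_size[gu] + others

# Distributed best-response updates: one GU deviates at a time and switches
# only when it strictly lowers its own cost.  This game admits a (weighted)
# potential function, so strict improvements cannot cycle and the loop
# terminates at a pure Nash equilibrium (NE).
assignment = [random.randrange(NUM_UAVS) for _ in range(NUM_GUS)]
improved = True
while improved:
    improved = False
    for gu in range(NUM_GUS):
        current = gu_cost(gu, assignment[gu], assignment)
        best = min(range(NUM_UAVS), key=lambda uav: gu_cost(gu, uav, assignment))
        if gu_cost(gu, best, assignment) < current:
            assignment[gu] = best
            improved = True

print("NE assignment (GU -> UAV):", assignment)
loads = [sum(data_size[g] for g, u in enumerate(assignment) if u == uav)
         for uav in range(NUM_UAVS)]
print("Per-UAV load (Mbits):", [round(l, 2) for l in loads])
```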
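
The trajectory step follows the general MADDPG pattern of centralized training with decentralized execution. The snippet below is a minimal PyTorch sketch, assuming illustrative observation/action sizes and a placeholder reward (the network sizes, dimensions, and reward are not taken from the paper): each UAV's actor maps its local observation to a bounded continuous 2-D action, while a centralized critic that sees all UAVs' observations and actions is used only during training.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of centralized-training / decentralized-execution with
# continuous actions.  Dimensions and the reward are illustrative assumptions.
OBS_DIM, ACT_DIM, N_UAVS = 10, 2, 3  # per-UAV observation/action sizes (assumed)

class Actor(nn.Module):
    """Decentralized policy: executed onboard a single UAV at run time."""
    def __init__(self, obs_dim=OBS_DIM, act_dim=ACT_DIM):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, act_dim), nn.Tanh(),  # bounded continuous 2-D action
        )
    def forward(self, obs):
        return self.net(obs)

class CentralCritic(nn.Module):
    """Centralized Q-function: sees all UAVs' observations and actions, but is
    only needed during training, not during execution."""
    def __init__(self, obs_dim=OBS_DIM, act_dim=ACT_DIM, n_agents=N_UAVS):
        super().__init__()
        joint = n_agents * (obs_dim + act_dim)
        self.net = nn.Sequential(
            nn.Linear(joint, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )
    def forward(self, joint_obs, joint_act):
        return self.net(torch.cat([joint_obs, joint_act], dim=-1))

# One illustrative critic update on random data (stand-in for a replay batch).
actors = [Actor() for _ in range(N_UAVS)]
critic = CentralCritic()
opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

batch = 32
obs = torch.randn(batch, N_UAVS, OBS_DIM)
acts = torch.stack([actors[i](obs[:, i]) for i in range(N_UAVS)], dim=1)
reward = torch.randn(batch, 1)  # placeholder for a delay/energy/obstacle reward
q = critic(obs.reshape(batch, -1), acts.reshape(batch, -1).detach())
loss = nn.functional.mse_loss(q, reward)  # stand-in for the TD target
opt.zero_grad(); loss.backward(); opt.step()
print("critic loss:", loss.item())
```

At execution time only the per-UAV actors are needed, which is what reduces the global synchronized communication overhead mentioned in the abstract.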
