Article

Semicentralized Deep Deterministic Policy Gradient in Cooperative StarCraft Games

Publisher

Institute of Electrical and Electronics Engineers (IEEE)
DOI: 10.1109/TNNLS.2020.3042943

Keywords

Games; Neural networks; Multi-agent systems; Markov processes; Training; Task analysis; Reinforcement learning; Deep deterministic policy gradient (DDPG); multiagent system; reinforcement learning (RL); StarCraft; stochastic environment

Funding

  1. National Science Foundation [CNS 1947418, ECCS 1947419]


In this article, we propose a novel semicentralized deep deterministic policy gradient (SCDDPG) algorithm for cooperative multiagent games. Specifically, we design a two-level actor-critic structure to help the agents interact and cooperate in StarCraft combat. A local actor-critic structure is established for each kind of agent using the partially observable information it receives from the environment. A global actor-critic structure is then built to give the local design an overall view of the combat based on limited centralized information, such as the agents' health values. The two structures work together to generate the optimal control action for each agent and to achieve better cooperation in the games. Compared with fully centralized methods, this design reduces the communication burden by sending only limited information to the global level during learning. Furthermore, reward functions are designed for both the local and global structures, based on the agents' attributes, to further improve learning performance in the stochastic environment. The method is demonstrated on several scenarios in the real-time strategy game StarCraft. Simulation results show that the agents effectively cooperate with their teammates and defeat the enemies in various StarCraft scenarios.
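The two-level structure described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the network sizes, the choice of health values as the centralized information, and all names here are assumptions; only the overall shape (a local actor-critic per agent type on partial observations, plus a global critic on limited centralized information) follows the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(n_in, n_hidden, n_out):
    # Tiny two-layer network; sizes are illustrative, not from the paper.
    return [rng.normal(0, 0.1, (n_in, n_hidden)), np.zeros(n_hidden),
            rng.normal(0, 0.1, (n_hidden, n_out)), np.zeros(n_out)]

def mlp_forward(params, x):
    W1, b1, W2, b2 = params
    h = np.tanh(x @ W1 + b1)
    return h @ W2 + b2

class LocalActorCritic:
    """Local structure: one actor-critic per agent type, fed only
    that agent's partial observation (per the abstract)."""
    def __init__(self, obs_dim, act_dim, hidden=32):
        self.actor = init_mlp(obs_dim, hidden, act_dim)
        self.critic = init_mlp(obs_dim + act_dim, hidden, 1)

    def act(self, obs):
        # Deterministic policy (the "deterministic" in DDPG),
        # squashed to [-1, 1].
        return np.tanh(mlp_forward(self.actor, obs))

    def q_value(self, obs, action):
        return mlp_forward(self.critic, np.concatenate([obs, action]))[0]

class GlobalCritic:
    """Global structure: sees only limited centralized information
    (here, assumed to be each agent's health) plus the joint action,
    not the agents' full observations."""
    def __init__(self, n_agents, act_dim, hidden=32):
        self.critic = init_mlp(n_agents + n_agents * act_dim, hidden, 1)

    def q_value(self, healths, joint_action):
        x = np.concatenate([healths, joint_action.ravel()])
        return mlp_forward(self.critic, x)[0]

# One forward pass for two agent types in a toy combat state.
obs_dim, act_dim, n_agents = 6, 2, 2
locals_ = [LocalActorCritic(obs_dim, act_dim) for _ in range(n_agents)]
glob = GlobalCritic(n_agents, act_dim)

observations = rng.normal(size=(n_agents, obs_dim))  # partial, per agent
healths = np.array([0.8, 0.5])                       # limited centralized info
actions = np.stack([ac.act(o) for ac, o in zip(locals_, observations)])

local_qs = [ac.q_value(o, a)
            for ac, (o, a) in zip(locals_, zip(observations, actions))]
global_q = glob.q_value(healths, actions)
print(actions.shape, len(local_qs))
```

Only `healths` and the joint action cross to the global level, which mirrors the communication-saving argument in the abstract: the global critic never receives the agents' full observations.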

