Journal
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
Volume 33, Issue 4, Pages 1584-1593
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TNNLS.2020.3042943
Keywords
Games; Neural networks; Multi-agent systems; Markov processes; Training; Task analysis; Reinforcement learning; Deep deterministic policy gradient (DDPG); multiagent system; reinforcement learning (RL); StarCraft; stochastic environment
Funding
- National Science Foundation [CNS 1947418, ECCS 1947419]
Abstract
In this article, we propose a novel semicentralized deep deterministic policy gradient (SCDDPG) algorithm for cooperative multiagent games. Specifically, we design a two-level actor-critic structure to help the agents with interactions and cooperation in the StarCraft combat. The local actor-critic structure is established for each kind of agent with partially observable information received from the environment. Then, the global actor-critic structure is built to provide the local design an overall view of the combat based on the limited centralized information, such as the health value. These two structures work together to generate the optimal control action for each agent and to achieve better cooperation in the games. Compared with fully centralized methods, this design reduces the communication burden by sending only limited information to the global level during the learning process. Furthermore, the reward functions are also designed for both local and global structures based on the agents' attributes to further improve the learning performance in the stochastic environment. The developed method has been demonstrated on several scenarios in a real-time strategy game, i.e., StarCraft. The simulation results show that the agents can effectively cooperate with their teammates and defeat the enemies in various StarCraft scenarios.
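The two-level layout described in the abstract can be sketched roughly as follows. This is a minimal illustrative sketch only: the class names, the use of simple linear maps in place of deep networks, and the way local and global actions are combined are all assumptions for illustration, since the paper's actual network architectures, losses, and StarCraft interface are not given here.

```python
import random

class ActorCritic:
    """Hypothetical stand-in for one actor-critic pair (linear actor only)."""
    def __init__(self, obs_dim, act_dim, seed=0):
        rng = random.Random(seed)
        # Random linear policy weights; a real SCDDPG agent would use
        # deep actor and critic networks trained with DDPG-style updates.
        self.w_actor = [[rng.uniform(-0.1, 0.1) for _ in range(obs_dim)]
                        for _ in range(act_dim)]

    def act(self, obs):
        # Deterministic policy: linear map from observation to action.
        return [sum(w * o for w, o in zip(row, obs)) for row in self.w_actor]

class SCDDPGAgents:
    """Semicentralized layout: one local actor-critic per agent type, plus a
    global actor-critic that sees only limited centralized information
    (e.g., health values), reducing communication versus full centralization."""
    def __init__(self, agent_types, local_obs_dim, global_info_dim, act_dim):
        self.local = {t: ActorCritic(local_obs_dim, act_dim, seed=i)
                      for i, t in enumerate(agent_types)}
        self.global_ac = ActorCritic(global_info_dim, act_dim, seed=99)

    def control(self, agent_type, local_obs, global_info):
        # How the two levels are fused is an assumption here; we simply add
        # a global corrective term to the local action.
        local_a = self.local[agent_type].act(local_obs)
        global_a = self.global_ac.act(global_info)
        return [la + ga for la, ga in zip(local_a, global_a)]
```

For example, with two agent types ("marine", "medic"), each agent queries its type's local structure with its partial observation, while the shared global structure contributes a term computed from the limited centralized information.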
Authors