Article

Cooperative traffic signal control using Multi-step return and Off-policy Asynchronous Advantage Actor-Critic Graph algorithm

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 183

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2019.07.026

Keywords

Cooperative traffic signal control; Coordination graph algorithm; Multiagent deep reinforcement learning; Transfer planning; Asynchronous Advantage Actor-Critic (A3C) algorithm

Funding

  1. Sichuan Science and Technology Program, China [2019YJ0164]
  2. Research Grants Council of the Hong Kong Special Administrative Region, China [11300715]
  3. City University of Hong Kong [7005055]

Intelligent traffic signal control helps reduce traffic congestion and has been studied for several decades. Multi-intersection cooperative traffic signal control (CTSC), which is more practical than single-intersection traffic signal control, has attracted considerable research attention in recent years. Existing work on multi-intersection CTSC derives responsive policies from the sequence of agents' actions. One issue in multi-intersection CTSC is that each agent's actions are mapped from its own road information alone, while other useful information, e.g., the distance between adjacent agents, is ignored, which may lead to suboptimal traffic signal control policies. To address this issue, this paper proposes a decentralized coordination graph algorithm, referred to as the Multi-step return and Off-policy Asynchronous Advantage Actor-Critic Graph (MOA3CG) algorithm. MOA3CG is based on an asynchronous method of multiagent deep reinforcement learning and a coordination graph; it makes traffic signal control policies based on current traffic states, the history of observations, and other information. A new reward function and an Adjusting Matrix of Traffic Signal Phase Control (AMTSPC) are also proposed and used by MOA3CG in the policy-making process; the AMTSPC alters the selection of actions by taking the distance between adjacent agents into account. Experimental results on real-world road scenarios show that the proposed algorithm outperforms four other state-of-the-art algorithms in terms of average delay, average travel time of vehicles, and vehicle throughput, and thus helps mitigate traffic congestion. © 2019 Elsevier B.V. All rights reserved.
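
For readers unfamiliar with two terms in the algorithm's name, "multi-step return" and "advantage", the sketch below shows how they are commonly computed in A3C-style methods. The abstract does not give the paper's formulas, so this is the textbook n-step return and not the MOA3CG update itself; the function names, the discount factor `gamma`, and the sample numbers are illustrative assumptions.

```python
from typing import List


def n_step_returns(rewards: List[float], bootstrap_value: float,
                   gamma: float = 0.99) -> List[float]:
    """Discounted multi-step returns for one rollout segment.

    bootstrap_value is the critic's value estimate for the state reached
    after the last reward (use 0.0 if the episode ended there).
    """
    returns = [0.0] * len(rewards)
    running = bootstrap_value
    # Work backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def advantages(returns: List[float], values: List[float]) -> List[float]:
    """Advantage = multi-step return minus the critic's value estimate."""
    if len(returns) != len(values):
        raise ValueError("returns and values must have the same length")
    return [g - v for g, v in zip(returns, values)]


# Hypothetical rollout: rewards could be, e.g., negative vehicle delay.
rollout_returns = n_step_returns([1.0, 1.0, 1.0], bootstrap_value=0.0, gamma=0.5)
rollout_advantages = advantages(rollout_returns, [1.0, 1.0, 1.0])
```

In an actor-critic learner, the advantages would weight the policy-gradient term and the returns would serve as regression targets for the critic; how MOA3CG combines these with its coordination graph and AMTSPC is described in the paper, not here.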
