Article

Deep Reinforcement Learning for Traffic Light Timing Optimization

Journal

PROCESSES
Volume 10, Issue 11

Publisher

MDPI
DOI: 10.3390/pr10112458

Keywords

traffic light control; deep reinforcement learning

Funding

  1. National Key Research and Development Program of China [2018YFB1003602]

This paper proposes EP-D3QN, a traffic light timing optimization method based on a double dueling deep Q-network, MaxPressure, and self-organizing traffic lights (SOTL). The method controls traffic flows by dynamically adjusting the duration of the traffic lights in a cycle, which reduces vehicle waiting and travel times and improves the efficiency of intersections.
Inflexible and ineffective traffic light control at a key intersection often leads to traffic congestion, and because traffic dynamics are complex, finding the optimal traffic light timing strategy is a significant challenge. This paper proposes EP-D3QN, a traffic light timing optimization method based on a double dueling deep Q-network, MaxPressure, and self-organizing traffic lights (SOTL). EP-D3QN controls traffic flows by dynamically adjusting the duration of the traffic lights in a cycle; whether a phase is switched is decided by the rules we set in advance and by the pressure of the lane. In EP-D3QN, each intersection corresponds to an agent. The road entering the intersection is divided into grids, each grid storing the speed and position of a car; together these form the vehicle information matrix, which serves as the state of the agent. The action of the agent is a set of traffic light phases in a signal cycle and has four possible values. The effective duration of the traffic lights is 0-60 s, and phase switching depends on the pressure and on the rules we set. The reward of the agent is the difference between the total accumulated waiting time of all vehicles in two consecutive signal cycles. Two traffic scenarios are simulated in SUMO (Simulation of Urban MObility). We selected two types of evaluation indicators and compared four methods to verify the effectiveness of EP-D3QN. The experimental results show that EP-D3QN performs better in both light and heavy traffic flow scenarios: it reduces the waiting time and travel time of vehicles and improves the traffic efficiency of an intersection.
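
The state, reward, and pressure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the cell length, the speed normalization, the sign of the reward, or the exact pressure formula, so several things here are assumptions: the constants `CELL_LENGTH_M` and `MAX_SPEED_MPS`, the function names, the "previous minus current" reward convention, and the usual MaxPressure-style definition of pressure as incoming minus outgoing vehicle counts.

```python
from typing import List, Sequence, Tuple

CELL_LENGTH_M = 5.0   # assumed grid cell length; not stated in the abstract
MAX_SPEED_MPS = 13.9  # assumed speed limit, used only to normalize speeds


def build_state(
    vehicles: Sequence[Tuple[int, float, float]],
    num_lanes: int,
    lane_length_m: float,
) -> Tuple[List[List[int]], List[List[float]]]:
    """Build the vehicle information matrix as (position, speed) grids.

    Each vehicle is (lane_index, distance_to_stop_line_m, speed_mps).
    Both grids have shape num_lanes x num_cells.
    """
    num_cells = int(lane_length_m // CELL_LENGTH_M)
    position = [[0] * num_cells for _ in range(num_lanes)]
    speed = [[0.0] * num_cells for _ in range(num_lanes)]
    for lane, distance, v in vehicles:
        cell = int(distance // CELL_LENGTH_M)
        # Ignore vehicles that fall outside the observed stretch of road.
        if 0 <= lane < num_lanes and 0 <= cell < num_cells:
            position[lane][cell] = 1
            speed[lane][cell] = min(v / MAX_SPEED_MPS, 1.0)
    return position, speed


def cycle_reward(prev_wait: Sequence[float], curr_wait: Sequence[float]) -> float:
    """Difference in total accumulated waiting time between two cycles.

    Assumed sign: positive when the current cycle has less total waiting.
    """
    return sum(prev_wait) - sum(curr_wait)


def lane_pressure(incoming_count: int, outgoing_count: int) -> int:
    """Assumed MaxPressure-style pressure: incoming minus outgoing vehicles."""
    return incoming_count - outgoing_count
```

In a SUMO experiment the vehicle tuples and waiting times would come from the simulator. Here they are plain Python values so the sketch stays self-contained.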
