Article

Deep Reinforcement Learning for Traffic Light Timing Optimization

PROCESSES (2022)

Journal

PROCESSES

Volume 10, Issue 11

Publisher

MDPI

DOI: 10.3390/pr10112458

Keywords

traffic light control; deep reinforcement learning

Funding

National Key Research and Development Program of China [2018YFB1003602]

Automated Summary

This paper proposes a traffic light timing optimization method called EP-D3QN, based on a double dueling deep Q-network, MaxPressure, and self-organizing traffic lights (SOTL). The method controls traffic flows by dynamically adjusting the duration of traffic lights in a cycle, which reduces vehicle waiting and travel times and improves the efficiency of intersections.

Abstract

Inflexible and ineffective traffic light control at a key intersection can often lead to traffic congestion. Because traffic dynamics are complex, finding the optimal traffic light timing strategy is a significant challenge. This paper proposes a traffic light timing optimization method based on a double dueling deep Q-network, MaxPressure, and self-organizing traffic lights (SOTL), namely EP-D3QN. It controls traffic flows by dynamically adjusting the duration of traffic lights in a cycle, and whether the phase is switched depends on rules set in advance and on the pressure of the lane. In EP-D3QN, each intersection corresponds to an agent. The road entering the intersection is divided into grids, and each grid cell stores the speed and position of a car; together these form the vehicle information matrix, which serves as the state of the agent. The action of the agent is a set of traffic light phases in a signal cycle, which has four values. The effective duration of the traffic lights is 0 to 60 s, and traffic light phase switching depends on the lane pressure and the preset rules. The reward of the agent is the difference between the sums of the accumulated waiting time of all vehicles in two consecutive signal cycles. SUMO is used to simulate two traffic scenarios. We selected two types of evaluation indicators and compared four methods to verify the effectiveness of EP-D3QN. The experimental results show that EP-D3QN performs better in both light and heavy traffic flow scenarios, reducing the waiting time and travel time of vehicles and improving the traffic efficiency of an intersection.
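The state and reward described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the cell length, the (lane, distance, speed) input format, and the sign convention of the reward (previous cycle minus current cycle, so that a drop in waiting time is rewarded) are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import List, Sequence, Tuple

# Hypothetical input format: (lane index, distance to stop line in m, speed in m/s).
Vehicle = Tuple[int, float, float]


def vehicle_matrix(
    vehicles: Sequence[Vehicle],
    lane_length: float,
    cell_length: float,
    num_lanes: int,
) -> Tuple[List[List[int]], List[List[float]]]:
    """Discretize incoming lanes into cells and record presence and speed.

    Returns two num_lanes x num_cells grids: an occupancy grid (0/1) and a
    speed grid. If two vehicles fall into the same cell, the later one
    overwrites the earlier one (an assumption; a cell roughly one vehicle
    long avoids this in practice).
    """
    num_cells = int(lane_length // cell_length)
    position = [[0] * num_cells for _ in range(num_lanes)]
    speed = [[0.0] * num_cells for _ in range(num_lanes)]
    for lane, distance, velocity in vehicles:
        cell = int(distance // cell_length)
        # Ignore vehicles outside the observed stretch of road.
        if 0 <= lane < num_lanes and 0 <= cell < num_cells:
            position[lane][cell] = 1
            speed[lane][cell] = velocity
    return position, speed


def cycle_reward(prev_waits: Sequence[float], curr_waits: Sequence[float]) -> float:
    """Difference of summed accumulated waiting times over two signal cycles.

    Assumed sign: previous minus current, so the reward is positive when
    total waiting time decreases.
    """
    return sum(prev_waits) - sum(curr_waits)


if __name__ == "__main__":
    pos, spd = vehicle_matrix(
        [(0, 3.0, 0.0), (1, 12.5, 8.0)],
        lane_length=20.0, cell_length=5.0, num_lanes=2,
    )
    print(pos)
    print(spd)
    print(cycle_reward([10.0, 5.0], [4.0, 6.0]))
```

In a SUMO experiment the vehicle tuples and waiting times would come from the simulator rather than from hand-written lists; the grids would then be stacked and passed to the Q-network as the agent's state.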
