4.6 Article

Deep Reinforcement Learning for Traffic Light Timing Optimization

Journal

PROCESSES
Volume 10, Issue 11, Pages -

Publisher

MDPI
DOI: 10.3390/pr10112458

Keywords

traffic light control; deep reinforcement learning

Funding

  1. National Key Research and Development Program of China [2018YFB1003602]

This paper proposes a traffic light timing optimization method called EP-D3QN based on double dueling deep Q-network, MaxPressure, and Self-organizing traffic lights (SOTL). The method controls traffic flows by dynamically adjusting the duration of traffic lights in a cycle, leading to significant reductions in waiting and travel times for vehicles, and improving the efficiency of intersections.
Inflexible and ineffective traffic light control at a key intersection often leads to traffic congestion. Because traffic dynamics are complex, finding the optimal traffic light timing strategy is a significant challenge. This paper proposes EP-D3QN, a traffic light timing optimization method based on double dueling deep Q-network, MaxPressure, and Self-organizing traffic lights (SOTL). It controls traffic flows by dynamically adjusting the duration of traffic lights in a cycle; whether a phase is switched depends on rules set in advance and on the pressure of the lane. In EP-D3QN, each intersection corresponds to an agent. The roads entering the intersection are divided into grid cells, each storing the speed and position of a car; together these form the vehicle information matrix, which serves as the agent's state. The agent's action is a set of traffic light phases in a signal cycle and has four possible values. The effective duration of a traffic light is 0-60 s, and phase switching depends on the lane pressure and the preset rules. The agent's reward is the difference between the total accumulated waiting times of all vehicles in two consecutive signal cycles. SUMO is used to simulate two traffic scenarios. Two types of evaluation indicators were selected and four methods were compared to verify the effectiveness of EP-D3QN. The experimental results show that EP-D3QN performs better in both light and heavy traffic flow scenarios, reducing vehicle waiting and travel times and improving the traffic efficiency of the intersection.
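
The state and reward described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Vehicle` type, the function names, the 5 m cell length, and the sign convention of the reward (previous cycle minus current cycle, so that less waiting gives a positive reward) are all assumptions made for illustration. The abstract only says that each grid cell stores a car's position and speed and that the reward is a difference of accumulated waiting times over two consecutive cycles.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Vehicle:
    """Hypothetical vehicle record, not a SUMO type."""
    position_m: float  # distance from the stop line along the incoming lane
    speed_mps: float   # current speed in metres per second


def encode_lane(
    vehicles: Sequence[Vehicle],
    lane_length_m: float = 100.0,  # assumed observation range
    cell_length_m: float = 5.0,    # assumed grid cell size
) -> Tuple[List[int], List[float]]:
    """Discretize one incoming lane into cells.

    Returns a position row (1 if a cell holds a car, else 0) and a
    speed row (speed of the car in that cell, else 0.0). Stacking the
    rows of all incoming lanes gives a vehicle information matrix.
    """
    n_cells = int(lane_length_m // cell_length_m)
    position = [0] * n_cells
    speed = [0.0] * n_cells
    for v in vehicles:
        if 0.0 <= v.position_m < lane_length_m:
            idx = min(int(v.position_m // cell_length_m), n_cells - 1)
            position[idx] = 1
            speed[idx] = v.speed_mps
    return position, speed


def cycle_reward(
    prev_cycle_wait_s: Sequence[float],
    curr_cycle_wait_s: Sequence[float],
) -> float:
    """Difference of total accumulated waiting time between two cycles.

    Assumed sign: previous minus current, so the reward is positive
    when the total waiting time drops.
    """
    return sum(prev_cycle_wait_s) - sum(curr_cycle_wait_s)
```

For example, a 20 m lane with 5 m cells has four cells, so a stopped car 2 m from the stop line and a moving car 12.5 m away occupy cells 0 and 2.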
