Article

Deep Reinforcement Learning for Traffic Light Timing Optimization

PROCESSES (2022)

Journal

PROCESSES

Volume 10, Issue 11, Pages -

Publisher

MDPI

DOI: 10.3390/pr10112458

Keywords

traffic light control; deep reinforcement learning

Category

Engineering, Chemical

Funding

National Key Research and Development Program of China [2018YFB1003602]

Abstract

This paper proposes a traffic light timing optimization method called EP-D3QN based on double dueling deep Q-network, MaxPressure, and Self-organizing traffic lights (SOTL). The method controls traffic flows by dynamically adjusting the duration of traffic lights in a cycle, leading to significant reductions in waiting and travel times for vehicles, and improving the efficiency of intersections.

Inflexible and ineffective traffic light control at a key intersection often leads to traffic congestion. Because traffic dynamics are complex, finding the optimal traffic light timing strategy is a significant challenge. This paper proposes EP-D3QN, a traffic light timing optimization method based on the double dueling deep Q-network, MaxPressure, and Self-organizing traffic lights (SOTL). EP-D3QN controls traffic flows by dynamically adjusting the duration of traffic lights within a cycle; whether the phase is switched depends on rules set in advance and on the pressure of the lane. In EP-D3QN, each intersection corresponds to an agent. The roads entering the intersection are divided into grids, and each grid cell stores the speed and position of a car; together the cells form the vehicle information matrix, which serves as the state of the agent. The action of the agent is a set of traffic light phases in a signal cycle, which has four values. The effective duration of the traffic lights is 0-60 s, and phase switching depends on the lane pressure and the preset rules. The reward of the agent is the difference between the total accumulated waiting time of all vehicles in two consecutive signal cycles. SUMO is used to simulate two traffic scenarios. Two types of evaluation indicators are selected and four methods are compared to verify the effectiveness of EP-D3QN. The experimental results show that EP-D3QN achieves superior performance in both light and heavy traffic flow scenarios, reducing vehicle waiting time and travel time and improving the traffic efficiency of the intersection.
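The abstract names several building blocks without giving formulas. The Python sketch below illustrates how they might look. It is not the authors' implementation. All function names, the vehicle tuple layout, the cell length, and the sign convention of the reward (previous cycle minus current cycle, so that less waiting yields a positive reward) are assumptions made for illustration. The dueling aggregation and the double Q-learning target follow the standard textbook forms of those techniques.

```python
from typing import List, Sequence, Tuple

Grid = List[List[float]]


def encode_state(
    vehicles: Sequence[Tuple[int, float, float]],
    lane_length_m: float,
    cell_length_m: float,
    num_lanes: int,
) -> Tuple[Grid, Grid]:
    """Build position and speed grids for the incoming lanes.

    Each vehicle is (lane_index, distance_to_stop_line_m, speed_mps);
    this tuple layout is hypothetical.
    """
    num_cells = int(lane_length_m // cell_length_m)
    position = [[0.0] * num_cells for _ in range(num_lanes)]
    speed = [[0.0] * num_cells for _ in range(num_lanes)]
    for lane, dist, v in vehicles:
        cell = int(dist // cell_length_m)
        # Ignore vehicles outside the observed stretch of road.
        if 0 <= lane < num_lanes and 0 <= cell < num_cells:
            position[lane][cell] = 1.0
            speed[lane][cell] = v
    return position, speed


def cycle_reward(prev_total_wait_s: float, curr_total_wait_s: float) -> float:
    """Difference in total accumulated waiting time between two cycles.

    Assumed sign: positive when waiting time went down.
    """
    return prev_total_wait_s - curr_total_wait_s


def dueling_q(value: float, advantages: Sequence[float]) -> List[float]:
    """Standard dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean(A)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_q_target(
    reward: float,
    gamma: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
) -> float:
    """Standard double Q-learning target.

    The online network selects the next action; the target network
    evaluates it.
    """
    best = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best]
```

With four actions, as stated in the abstract, `dueling_q` returns four Q-values and `double_q_target` picks among four next-state estimates. In a real agent the value and advantage inputs would come from a neural network fed with the two grids.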
