Article

Model-free perimeter metering control for two-region urban networks using deep reinforcement learning

Journal

Transportation Research Part C: Emerging Technologies

Publisher

Pergamon-Elsevier Science Ltd
DOI: 10.1016/j.trc.2020.102949

Keywords

Macroscopic Fundamental Diagram (MFD); Model-free deep reinforcement learning (Deep-RL); Perimeter control

Funding

  1. NSF [CMMI-1749200]
  2. Penn State Institute of CyberScience

Abstract

Various perimeter metering control strategies have been proposed for urban traffic networks that rely on the existence of well-defined relationships between network productivity and accumulation, more commonly known as network Macroscopic Fundamental Diagrams (MFDs). Most existing perimeter metering control strategies require accurate modeling of the traffic dynamics, with full knowledge of the network MFDs and of the dynamic equations that describe how vehicles move across regions of the network. However, such information is generally difficult to obtain and subject to error. Some model-free perimeter metering control schemes have recently been proposed in the literature; however, these existing approaches still require estimates of network properties (e.g., the critical accumulation associated with maximum network productivity) in the controller designs. In this paper, a model-free deep reinforcement learning perimeter control (MFDRLPC) scheme is proposed for two-region urban networks that features agents with either continuous or discrete action spaces. The proposed agents learn to select control actions through a reinforcement learning process without assuming any information about the environment dynamics. Results from extensive numerical experiments demonstrate that the proposed agents: (a) can consistently learn perimeter control strategies under various environment configurations; (b) are comparable in performance to state-of-the-art model predictive control (MPC); and (c) are highly transferable to a wide range of traffic conditions and dynamics in the environment.
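
To make the control setting concrete, the sketch below implements a toy two-region MFD environment of the kind a model-free deep-RL perimeter controller would interact with: accumulation states n_ij (vehicles in region i bound for region j), a concave production MFD per region, and two perimeter actions u12 and u21 that meter the transfer flows between regions. This is a minimal illustration under assumed parameters; the MFD shape, demand profile, reward, and all constants are hypothetical placeholders, not the paper's calibrated model or its agent implementations.

    # Minimal two-region MFD environment sketch (illustrative only; all
    # parameters are assumed, not taken from the paper).
    import numpy as np

    class TwoRegionMFDEnv:
        """Accumulation dynamics n[i, j]: vehicles in region i headed to region j."""

        def __init__(self, dt=30.0, n_jam=10000.0, horizon=120):
            self.dt = dt              # control interval [s] (assumed)
            self.n_jam = n_jam        # jam accumulation per region [veh] (assumed)
            self.horizon = horizon    # steps per episode (assumed)
            self.reset()

        def mfd(self, n):
            """Illustrative concave MFD: trip completion rate G(n) [veh/s]."""
            return max(0.0, 1.5e-3 * n * (1.0 - n / self.n_jam) ** 2)

        def reset(self):
            # Initial accumulations by origin region and destination (assumed).
            self.n = np.array([[2000.0, 1000.0], [1000.0, 2000.0]])
            self.t = 0
            return self.n.flatten() / self.n_jam  # normalized observation

        def step(self, u):
            """u = (u12, u21): fraction of each transfer flow allowed through."""
            u12, u21 = np.clip(u, 0.1, 1.0)
            q = np.array([[1.0, 0.8], [0.8, 1.0]])  # exogenous demand [veh/s] (assumed)
            n1, n2 = self.n[0].sum(), self.n[1].sum()
            # Completion/transfer flows, split by destination share of accumulation.
            M11 = self.mfd(n1) * self.n[0, 0] / max(n1, 1.0)
            M12 = self.mfd(n1) * self.n[0, 1] / max(n1, 1.0)
            M21 = self.mfd(n2) * self.n[1, 0] / max(n2, 1.0)
            M22 = self.mfd(n2) * self.n[1, 1] / max(n2, 1.0)
            # Conservation: metered transfers enter the neighboring region's
            # internal accumulation; internal trips complete at M11, M22.
            dn = np.array([
                [q[0, 0] + u21 * M21 - M11, q[0, 1] - u12 * M12],
                [q[1, 0] - u21 * M21, q[1, 1] + u12 * M12 - M22],
            ])
            self.n = np.clip(self.n + dn * self.dt, 0.0, self.n_jam)
            self.t += 1
            reward = (M11 + M22) * self.dt  # trips completed this step
            done = self.t >= self.horizon
            return self.n.flatten() / self.n_jam, reward, done

    # Random-action rollout to show the interaction loop.
    env = TwoRegionMFDEnv()
    obs, total, done = env.reset(), 0.0, False
    while not done:
        obs, r, done = env.step(np.random.uniform(0.1, 1.0, size=2))
        total += r
    print(f"completed trips under random metering: {total:.0f}")

A Deep-RL agent, e.g., an actor-critic method for the continuous-action case or a DQN-style method for the discrete case, would replace the random policy in the rollout loop, learning purely from these observation-reward interactions without access to the environment's transition equations.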
