Model-free reinforcement learning with model-based safe exploration: Optimizing adaptive recovery process of infrastructure systems

STRUCTURAL SAFETY (2019)

Journal

STRUCTURAL SAFETY

Volume 80, Issue -, Pages 46-55

Publisher

ELSEVIER

DOI: 10.1016/j.strusafe.2019.04.003

Keywords

Reinforcement learning; Extreme events; Resilience; Infrastructure systems

Funding

National Science Foundation, Directorate for Engineering, Division of Civil, Mechanical, and Manufacturing Innovation [1663479]

Abstract

Extreme events are not only some of the most damaging events for our society and environment, but also some of the most difficult to predict. Model-based predictions of the disruptions that extreme events induce in urban infrastructure systems are often unreliable, as these events are unlikely by their very definition. Specifically, characterizing the effect of such disruptions on urban infrastructure using a parameterized model is a difficult task. On the other hand, model-free approaches based on recent advancements in reinforcement learning can model the complex dynamics of urban society and infrastructure under the risk of extreme events explicitly, without relying on any specific physics-based mechanism. However, these approaches usually require random exploration of the effects of management actions on the system (typically in the post-event situation) to obtain an acceptable approximation to the optimal management policy. When dealing with costly infrastructure systems and important communities, this random exploration can be unacceptably risky. In this paper, we propose a method called Safe Q-learning, a model-free reinforcement learning approach with the addition of model-based safe exploration, for near-optimal management of infrastructure systems pre-event and of their recovery post-event. Our method requires the decision-maker to model the structure of the state space of the problem and a suitable equilibrium of the system (optimum functionality pre-event). This information is usually available for urban systems, as they spend a long time in optimum equilibrium before such events occur. We show on several examples of infrastructure management how the proposed approach achieves near-optimal performance without the risk due to random exploration.
