Article

Optimal policy for structure maintenance: A deep reinforcement learning framework

Journal

STRUCTURAL SAFETY
Volume 83, Article 101906

Publisher

ELSEVIER
DOI: 10.1016/j.strusafe.2019.101906

Keywords

Bridge maintenance policy; Deep reinforcement learning (DRL); Markov decision process (MDP); Deep Q-network (DQN); Convolutional neural network (CNN)

Funding

  1. NSFC [U1711265, 51638007, 51678203, 51478149, 51678204]

Abstract

The cost-effective management of aged infrastructure is an issue of worldwide concern. Markov decision process (MDP) models have been used to develop structural maintenance policies. Recent advances in the artificial intelligence (AI) community have shown that deep reinforcement learning (DRL) has the potential to solve large MDP optimization tasks. This paper proposes a novel automated DRL framework for obtaining an optimized structural maintenance policy. The framework comprises a decision maker (the AI agent) and the structure to be maintained (the AI task environment). The agent outputs maintenance policies and chooses maintenance actions, while the task environment determines the state transitions of the structure and returns rewards to the agent for the chosen actions. The advantages of the DRL framework are: (1) a deep neural network (DNN) is employed to learn the state-action Q value (the expected discounted return under a given state-action pair), either from simulations or from historical data, and the policy is then derived from the learned Q value; (2) learning is sample-based, so the framework can learn directly from real historical data collected from multiple bridges (i.e., big data from a large number of bridges); and (3) the same general framework applies to different structural maintenance tasks with minimal changes to the neural network architecture. Case studies of a simple bridge deck with seven components and a long-span cable-stayed bridge with 263 components demonstrate the proposed procedure. The results show that DRL efficiently finds optimal maintenance policies for both simple and complex structures.
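
Illustrative sketch (not from the paper)

The abstract defines the Q value as the expected discounted return for a state-action pair, i.e. Q(s, a) = E[ sum_{t>=0} gamma^t * r_t | s_0 = s, a_0 = a ], and the learned policy as the greedy choice pi(s) = argmax_a Q(s, a). The paper itself provides no code, so the following is only a minimal sketch of how a DQN-style agent could be coupled to a toy component-deterioration MDP. The environment dynamics, cost values, network sizes, and all names (DeteriorationEnv, q_net, etc.) are hypothetical placeholders, and the target network and CNN architecture used in full DQN are omitted for brevity.

    # Minimal, illustrative DQN sketch for a component-deterioration maintenance MDP.
    # NOT the authors' implementation; dynamics, costs, and hyperparameters are
    # hypothetical placeholders chosen only to make the example runnable.
    import random
    from collections import deque

    import numpy as np
    import torch
    import torch.nn as nn

    N_COMPONENTS = 7          # e.g. the simple bridge-deck case has seven components
    N_STATES = 5              # assumed discrete condition ratings per component (0 = new)
    GAMMA = 0.95              # discount factor for the return

    class DeteriorationEnv:
        """Toy MDP: each component worsens stochastically; action i repairs component i,
        action N_COMPONENTS means 'do nothing'. Rewards are negative maintenance costs."""
        def reset(self):
            self.cond = np.zeros(N_COMPONENTS, dtype=np.int64)
            return self._obs()

        def _obs(self):
            return self.cond.astype(np.float32) / (N_STATES - 1)

        def step(self, action):
            cost = 0.0
            if action < N_COMPONENTS:                     # repair one component
                cost += 1.0
                self.cond[action] = 0
            worsen = np.random.rand(N_COMPONENTS) < 0.2   # stochastic deterioration
            self.cond = np.minimum(self.cond + worsen, N_STATES - 1)
            cost += 0.1 * self.cond.sum()                 # condition-dependent penalty
            return self._obs(), -cost                     # reward = negative cost

    q_net = nn.Sequential(nn.Linear(N_COMPONENTS, 64), nn.ReLU(),
                          nn.Linear(64, N_COMPONENTS + 1))    # Q(s, a) for each action
    optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
    buffer = deque(maxlen=10_000)                              # experience replay

    env = DeteriorationEnv()
    state = env.reset()
    epsilon = 0.1
    for t in range(5_000):
        # epsilon-greedy action selection from the learned Q values
        if random.random() < epsilon:
            action = random.randrange(N_COMPONENTS + 1)
        else:
            with torch.no_grad():
                action = int(q_net(torch.tensor(state)).argmax())
        next_state, reward = env.step(action)
        buffer.append((state, action, reward, next_state))
        state = next_state

        if len(buffer) >= 64:
            batch = random.sample(buffer, 64)
            s, a, r, s2 = map(np.array, zip(*batch))
            s, s2 = torch.tensor(s), torch.tensor(s2)
            a = torch.tensor(a, dtype=torch.int64)
            r = torch.tensor(r, dtype=torch.float32)
            # one-step TD target: r + gamma * max_a' Q(s', a')
            with torch.no_grad():
                target = r + GAMMA * q_net(s2).max(dim=1).values
            q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
            loss = nn.functional.mse_loss(q_sa, target)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

    # The greedy policy pi(s) = argmax_a Q(s, a) is then the learned maintenance policy.

Because the update is sample-based, the tuples (state, action, reward, next state) in the replay buffer could equally come from recorded inspection and maintenance histories rather than a simulator, which is the point made in advantage (2) of the abstract.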

