Article

Optimal policy for structure maintenance: A deep reinforcement learning framework

STRUCTURAL SAFETY (2020)

Journal

STRUCTURAL SAFETY

Volume 83, Issue -, Pages -

Publisher

ELSEVIER

DOI: 10.1016/j.strusafe.2019.101906

Keywords

Bridge maintenance policy; Deep reinforcement learning (DRL); Markov decision process (MDP); Deep Q-network (DQN); Convolutional neural network (CNN)

Category

Engineering, Civil

Funding

NSFC [U1711265, 51638007, 51678203, 51478149, 51678204]

Abstract

The cost-effective management of aged infrastructure is an issue of worldwide concern. Markov decision process (MDP) models have been used in developing structural maintenance policies. Recent advances in the artificial intelligence (AI) community have shown that deep reinforcement learning (DRL) has the potential to solve large MDP optimization tasks. This paper proposes a novel automated DRL framework to obtain an optimized structural maintenance policy. The DRL framework contains a decision maker (AI agent) and the structure that needs to be maintained (AI task environment). The agent outputs maintenance policies and chooses maintenance actions, and the task environment determines the state transition of the structure and returns rewards to the agent under given maintenance actions. The advantages of the DRL framework include: (1) a deep neural network (DNN) is employed to learn the state-action Q value (defined as the predicted discounted expectation of the return for consequences under a given state-action pair), either based on simulations or historical data, and the policy is then obtained from the Q value; (2) optimization of the learning process is sample-based so that it can learn directly from real historical data collected from multiple bridges (i.e., big data from a large number of bridges); and (3) a general framework is used for different structure maintenance tasks with minimal changes to the neural network architecture. Case studies for a simple bridge deck with seven components and a long-span cable-stayed bridge with 263 components are performed to demonstrate the proposed procedure. The results show that the DRL is efficient at finding the optimal policy for maintenance tasks for both simple and complex structures.
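The agent/environment loop and the state-action Q value described in the abstract can be illustrated with a minimal sketch. This is not the paper's method: the paper learns Q with a deep neural network, whereas the sketch below swaps in plain tabular Q-learning so that it stays short and dependency-free. The single-component environment, its four condition ratings, the deterioration probability, the costs, the discount factor, and all function names are hypothetical values chosen only for illustration.

```python
import random

# All constants below are illustrative assumptions, not values from the paper.
N_STATES = 4          # hypothetical condition ratings: 0 = good ... 3 = failed
ACTIONS = (0, 1)      # 0 = do nothing, 1 = repair
GAMMA = 0.9           # discount factor (assumed)


def step(state, action, rng):
    """Toy task environment: return (next_state, reward) for one period."""
    if action == 1:
        # Repair restores the component and incurs a fixed (assumed) cost.
        return 0, -5.0
    # Without maintenance the component deteriorates by one rating
    # with an assumed probability of 0.5.
    next_state = min(state + 1, N_STATES - 1) if rng.random() < 0.5 else state
    # Worse condition carries a higher (assumed) risk cost.
    reward = -10.0 if next_state == N_STATES - 1 else -float(next_state)
    return next_state, reward


def q_learning(episodes=2000, horizon=50, alpha=0.1, epsilon=0.1, seed=0):
    """Sample-based tabular Q-learning (stand-in for the paper's DNN)."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(horizon):
            # Epsilon-greedy agent: mostly exploit, occasionally explore.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward = step(state, action, rng)
            # Move Q(s, a) toward the one-step discounted return estimate.
            target = reward + GAMMA * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Derive the maintenance policy from the learned Q values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES)]


q_table = q_learning()
policy = greedy_policy(q_table)
print(policy)  # one action index per condition rating
```

The update uses only sampled transitions `(state, action, reward, next_state)`, which is the property the abstract relies on when it says the framework can learn from simulations or from historical records. Replacing the table `q` with a neural network that maps a state to one Q value per action would give the DQN-style setup the paper names in its keywords.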
