Article

Risk-informed operation and maintenance of complex lifeline systems using parallelized multi-agent deep Q-network

Journal

Reliability Engineering & System Safety

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.ress.2023.109512

Keywords

Deep reinforcement learning; Lifeline systems; Life-cycle cost; Markov decision process; Operation & maintenance; Parallel processing


A multi-agent deep reinforcement learning framework called parallelized multi-agent deep Q-network (PM-DQN) is proposed to overcome the curse of dimensionality in complex systems. The method divides the system into multiple subsystems, with each agent learning the operation and maintenance policy of the corresponding subsystem. The learning processes occur simultaneously in parallel units, and the trained policies are periodically synchronized to improve the master policy. Numerical examples demonstrate that the proposed method outperforms baseline policies.

Lifeline systems such as transportation and water distribution networks deteriorate with age, raising the risk of system failure or degradation. System-level sequential decision-making is therefore essential to address the problem cost-effectively while minimizing the potential loss. Researchers have proposed assessing the risk of lifeline systems using Markov decision processes (MDPs) to identify a risk-informed operation and maintenance (O&M) policy. In complex systems with many components, however, finding MDP solutions can become intractable because the state and action spaces grow exponentially with the number of components. This paper proposes a multi-agent deep reinforcement learning framework, termed parallelized multi-agent deep Q-network (PM-DQN), to overcome the curse of dimensionality. The proposed method takes a divide-and-conquer strategy, in which multiple subsystems are identified by community detection, and each agent learns the O&M policy of the corresponding subsystem. The agents establish policies to minimize the decentralized cost of the cluster unit, including the factorized cost. These learning processes occur simultaneously in several parallel units, and the trained policies are periodically synchronized with the best ones, thereby improving the master policy. Numerical examples demonstrate that the proposed method outperforms baseline policies, including conventional maintenance schemes and the subsystem-level optimal policy.
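The abstract only outlines the framework, so the following is a minimal, hypothetical sketch of its structure rather than the authors' implementation: tabular Q-learning stands in for the deep Q-network, a toy component-deterioration MDP stands in for the lifeline network, NetworkX's greedy modularity routine is one possible choice for the unspecified community-detection step, and a sequential loop emulates the parallel units. All names, costs, and parameters (SubsystemAgent, C_REPAIR, N_UNITS, and so on) are illustrative assumptions.

```python
# Illustrative sketch only -- NOT the authors' implementation. It shows the
# PM-DQN structure described in the abstract: (1) partition the system into
# subsystems via community detection, (2) let one agent learn a decentralized
# O&M policy per subsystem, (3) run several replicas in parallel units and
# periodically synchronize every unit to the best-performing policy.
import itertools
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

S, ACTIONS = 3, (0, 1)                 # deterioration states; 0 = do nothing, 1 = repair
P_DEGRADE, C_REPAIR, C_FAIL = 0.3, 1.0, 10.0


def partition(graph):
    """Step 1: community detection yields the subsystems (clusters of components)."""
    return [sorted(c) for c in greedy_modularity_communities(graph)]


class SubsystemAgent:
    """One Q-learning agent per subsystem (a deep Q-network in the actual framework)."""

    def __init__(self, n_comp, rng):
        self.n, self.rng = n_comp, rng
        self.states = list(itertools.product(range(S), repeat=n_comp))
        self.acts = list(itertools.product(ACTIONS, repeat=n_comp))
        self.q = np.zeros((len(self.states), len(self.acts)))

    def step(self, s, a):
        """Toy deterioration dynamics and decentralized (subsystem-level) cost."""
        nxt, cost = [], 0.0
        for x, u in zip(s, a):
            if u == 1:
                x, cost = 0, cost + C_REPAIR          # repair resets the component
            elif self.rng.random() < P_DEGRADE:
                x = min(x + 1, S - 1)                 # random deterioration
            cost += C_FAIL if x == S - 1 else 0.0     # penalty for a failed component
            nxt.append(x)
        return tuple(nxt), cost

    def train(self, episodes=100, horizon=20, eps=0.2, alpha=0.2, gamma=0.95):
        for _ in range(episodes):
            s = tuple(0 for _ in range(self.n))
            for _ in range(horizon):
                i = self.states.index(s)
                j = self.rng.integers(len(self.acts)) if self.rng.random() < eps \
                    else int(np.argmin(self.q[i]))
                s2, cost = self.step(s, self.acts[j])
                i2 = self.states.index(s2)
                self.q[i, j] += alpha * (cost + gamma * self.q[i2].min() - self.q[i, j])
                s = s2
        return self

    def evaluate(self, episodes=50, horizon=20, gamma=0.95):
        """Average discounted life-cycle cost of the greedy policy (lower is better)."""
        total = 0.0
        for _ in range(episodes):
            s = tuple(0 for _ in range(self.n))
            for t in range(horizon):
                j = int(np.argmin(self.q[self.states.index(s)]))
                s, cost = self.step(s, self.acts[j])
                total += (gamma ** t) * cost
        return total / episodes


# Toy lifeline network: 6 components forming two natural clusters.
g = nx.Graph([(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)])
subsystems = partition(g)
print("subsystems:", subsystems)

# Steps 2-3: parallel units with periodic synchronization to the best policy.
N_UNITS, N_SYNC = 4, 5
units = [[SubsystemAgent(len(sub), np.random.default_rng(u * 10 + k))
          for k, sub in enumerate(subsystems)] for u in range(N_UNITS)]
for sync_round in range(N_SYNC):
    for unit in units:                   # in PM-DQN these run in parallel processes
        for agent in unit:
            agent.train()
    for k in range(len(subsystems)):     # keep the best replica of each subsystem policy
        best = min((unit[k] for unit in units), key=lambda a: a.evaluate())
        for unit in units:               # broadcast it back to every parallel unit
            unit[k].q = best.q.copy()
print("master policy cost per subsystem:",
      [round(agent.evaluate(), 2) for agent in units[0]])
```

The periodic broadcast of the best replica's policy to every unit is one plausible reading of the abstract's statement that trained policies are "periodically synchronized with the best ones"; in practice each unit would run in its own process and exchange network weights rather than Q-tables.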

Authors

