Article

Autonomous Input Voltage Sharing Control and Triple Phase Shift Modulation Method for ISOP-DAB Converter in DC Microgrid: A Multiagent Deep Reinforcement Learning-Based Method

Journal

IEEE TRANSACTIONS ON POWER ELECTRONICS
Volume 38, Issue 3, Pages 2985-3000

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TPEL.2022.3218900

Keywords

Microgrids; Voltage control; Stress; Uncertainty; Minimization; Inductors; Training; Input-series output-parallel-connected dual active bridge (ISOP-DAB) converter; input voltage sharing (IVS); multiagent twin-delayed deep deterministic policy gradient (MA-TD3); triple phase shift modulation

Abstract

This article proposes a multiagent deep reinforcement learning (DRL) based approach to input voltage sharing control and modulation in an input-series output-parallel dual active bridge converter. The method addresses three challenges: uncertainties in the dc microgrid, power balance, and current stress minimization. The control and modulation problem is formulated as a Markov game and solved with the multiagent twin-delayed deep deterministic policy gradient (MA-TD3) algorithm. Simulation and experimental results demonstrate the effectiveness of the approach.
This article proposes a multiagent (MA) deep reinforcement learning (DRL) based autonomous input voltage sharing (IVS) control and triple phase shift modulation method for input-series output-parallel (ISOP) dual active bridge (DAB) converters. The method addresses three challenges: the uncertainties of the dc microgrid, the power balance problem, and the minimization of the converter's current stress. Specifically, the control and modulation problem of the ISOP-DAB converter is formulated as a Markov game with several DRL agents. The MA twin-delayed deep deterministic policy gradient (MA-TD3) algorithm is then applied to train the DRL agents offline. After training, the agents provide online control decisions for the ISOP-DAB converter that balance the input voltages and minimize the current stress across the submodules. Without requiring accurate model information, the proposed method adaptively obtains the optimal combinations of modulation variables in a stochastic and uncertain environment. Simulation and experimental results verify the effectiveness of the proposed MA-TD3-based algorithm.
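The abstract names TD3 as the underlying learning rule, so a brief illustration of that rule may help. The following is a minimal sketch of the standard single-agent TD3 critic target (clipped double-Q with target policy smoothing), not the article's implementation. The function name `td3_target`, the hyperparameter values, the action bounds, and the idea of representing the three phase-shift variables as a three-element action are all assumptions made for illustration; the abstract does not state how the article defines states, rewards, or the multiagent extension.

```python
import random


def td3_target(reward, done, next_action, q1_next, q2_next,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               act_low=-1.0, act_high=1.0, rng=random):
    """Standard TD3 target: r + gamma * (1 - done) * min(Q1', Q2').

    next_action: list of floats proposed by the target actor
                 (here, hypothetically, three phase-shift variables).
    q1_next, q2_next: callables standing in for the two target critics;
                 each maps an action list to a scalar value estimate.
    """
    smoothed = []
    for a in next_action:
        # Target policy smoothing: add clipped Gaussian noise ...
        eps = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
        # ... then keep the perturbed action inside its valid range.
        smoothed.append(max(act_low, min(act_high, a + eps)))
    # Clipped double-Q: take the more pessimistic of the two critics.
    q_min = min(q1_next(smoothed), q2_next(smoothed))
    # No bootstrapping past a terminal transition.
    return reward + gamma * (1.0 - float(done)) * q_min


if __name__ == "__main__":
    # Toy critics (hypothetical): simple functions of the action vector.
    q1 = lambda a: sum(a)
    q2 = lambda a: 2.0 * sum(a)
    y = td3_target(1.0, False, [0.5, 0.25, -0.25], q1, q2,
                   gamma=0.9, noise_std=0.0)
    print(y)
```

Taking the minimum of two critics counters value overestimation, and the smoothing noise discourages the actor from exploiting sharp peaks in the critic; both are general TD3 design choices rather than details taken from this article.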
