Article

Deep-Reinforcement-Learning-Based Capacity Scheduling for PV-Battery Storage System

Journal

IEEE TRANSACTIONS ON SMART GRID
Volume 12, Issue 3, Pages 2272-2283

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TSG.2020.3047890

Keywords

Batteries; Safety; Power grids; Frequency control; Aerospace electronics; Regulation; Energy management; Battery storage systems; deep reinforcement learning; energy arbitrage; frequency regulation

This article proposes a Proximal Policy Optimization-based deep reinforcement learning agent for capacity scheduling of photovoltaic-battery storage systems, aiming to ensure secure and economic operation. The PPO agent is more flexible in handling continuous action space compared to other methods, allowing it to maximize revenue while enforcing safety constraints.
Investor-owned photovoltaic-battery storage systems (PV-BSS) can gain revenue by providing stacked services, including PV charging and frequency regulation, and by performing energy arbitrage. Capacity scheduling (CS) is a crucial component of PV-BSS energy management, aiming to ensure the secure and economic operation of the PV-BSS. This article proposes a Proximal Policy Optimization (PPO)-based deep reinforcement learning (DRL) agent to perform the CS of PV-BSS. Unlike previous work that uses value-based methods with a discrete action space, PPO can readily handle a continuous action space and determine the specific amount of charging/discharging. To enforce the safety constraints on the BSS's energy and power capacity, a safety control algorithm using a serial strategy is proposed to cooperate with the PPO agent. The PPO agent can exploit the capacity of the BSS safely while maximizing the accumulated net revenue. After training, the PPO agent can adapt to highly uncertain and volatile market signals and PV generation profiles. The efficacy of the proposed CS scheme is substantiated using real market data. The comparative results demonstrate that the PPO agent outperforms the Deep Deterministic Policy Gradient agent, the Advantage Actor-Critic agent, and the Double Deep Q Network agent in terms of profitability and sample efficiency.
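The serial safety-control idea described in the abstract, projecting the agent's continuous charging/discharging action onto the feasible set before dispatch, can be illustrated with a minimal sketch. The function name, the round-trip efficiency, and the power/energy limits below are illustrative assumptions, not the paper's actual algorithm:

```python
# Minimal sketch of a serial safety-control step for a battery storage system.
# Sign convention: positive power = charging, negative = discharging.
# All parameter values are illustrative assumptions.

def safe_dispatch(action_kw, soc_kwh, capacity_kwh=100.0,
                  p_max_kw=25.0, dt_h=1.0, eta=0.95):
    """Clip a raw power command from the RL agent so the battery stays
    within its power rating and energy (state-of-charge) bounds."""
    # 1) Power-capacity constraint: limit to the converter rating.
    p = max(-p_max_kw, min(p_max_kw, action_kw))

    # 2) Energy-capacity constraint, applied after the power clip
    #    (hence "serial"): charging must not overfill the battery,
    #    discharging must not drive the state of charge below zero.
    if p >= 0:  # charging: soc_next = soc + eta * p * dt <= capacity
        headroom_kwh = capacity_kwh - soc_kwh
        p = min(p, headroom_kwh / (eta * dt_h))
        soc_next = soc_kwh + eta * p * dt_h
    else:       # discharging: soc_next = soc + p * dt / eta >= 0
        p = max(p, -soc_kwh * eta / dt_h)
        soc_next = soc_kwh + p * dt_h / eta
    return p, soc_next
```

For example, an aggressive charging command of 50 kW issued at 90 kWh of stored energy is first clipped to the 25 kW rating and then further reduced so the battery lands exactly at full capacity rather than overshooting it.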

