Article

Reinforcement learning and mixed-integer programming for power plant scheduling in low carbon systems: Comparison and hybridisation

Journal

APPLIED ENERGY
Volume 349

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.apenergy.2023.121659

Keywords

Unit commitment; Reinforcement learning; Mixed-integer programming; Renewable power uncertainty

Decarbonisation is driving the growth of renewable power generation and increasing uncertainty in power plant scheduling. This paper compares established mixed-integer programming methods with emerging reinforcement learning methods and finds that the former produce lower-cost schedules and are more reliable and scalable. The strength of reinforcement learning, however, lies in its ability to produce solutions almost instantly.
Decarbonisation is driving dramatic growth in renewable power generation. This increases uncertainty in the load to be served by power plants and makes their efficient scheduling, known as the unit commitment (UC) problem, more difficult. UC is solved in practice by mixed-integer programming (MIP) methods; however, there is growing interest in emerging data-driven methods including reinforcement learning (RL). In this paper, we extensively test two MIP (deterministic and stochastic) and two RL (model-free and with lookahead) scheduling methods over a large set of test days and problem sizes, for the first time comparing the state-of-the-art of these two approaches on a level playing field. We find that deterministic and stochastic MIP consistently produce lower-cost UC schedules than RL, exhibiting better reliability and scalability with problem size. Average operating costs of RL are more than 2 times larger than stochastic MIP for a 50-generator test case, while the cost is 13 times larger in the worst instance. However, the key strength of RL is the ability to produce solutions practically instantly, irrespective of problem size. We leverage this advantage to produce various initial solutions for warm starting concurrent stochastic MIP solves. By producing several near-optimal solutions simultaneously and then evaluating them using Monte Carlo methods, the differences between the true cost function and the discrete approximation required to formulate the MIP are exploited. The resulting hybrid technique outperforms both the RL and MIP methods individually, reducing total operating costs by 0.3% on average.
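The final step of the hybrid technique, scoring several candidate schedules by Monte Carlo simulation and keeping the cheapest, can be illustrated with a minimal sketch. This is not the authors' implementation: the RL agent and the warm-started stochastic MIP solves are not reproduced, and the candidate schedules are hand-written stand-ins for their outputs. The generator data, the lost-load penalty, the Gaussian demand noise, and all function names are assumptions made for illustration, and the dispatch model ignores minimum generation, ramping, and start-up costs.

```python
import random

# Hypothetical toy fleet: (capacity MW, marginal cost $/MWh, no-load cost $/h).
GENERATORS = [(100.0, 20.0, 500.0), (80.0, 35.0, 300.0), (50.0, 60.0, 100.0)]
LOST_LOAD_PENALTY = 10_000.0  # $/MWh, assumed value of lost load


def period_cost(commitment, demand):
    """Cost of one period: no-load costs plus merit-order dispatch of online units."""
    online = [g for g, on in zip(GENERATORS, commitment) if on]
    cost = sum(no_load for _, _, no_load in online)
    remaining = demand
    # Dispatch the cheapest online units first until demand is met.
    for capacity, marginal, _ in sorted(online, key=lambda g: g[1]):
        output = min(capacity, max(remaining, 0.0))
        cost += output * marginal
        remaining -= output
    # Any demand left unserved is charged at the lost-load penalty.
    return cost + max(remaining, 0.0) * LOST_LOAD_PENALTY


def monte_carlo_cost(schedule, forecast, n_samples=500, sigma=0.1, seed=0):
    """Average cost of a commitment schedule over sampled demand scenarios."""
    rng = random.Random(seed)  # same seed per candidate: common random numbers
    total = 0.0
    for _ in range(n_samples):
        for commitment, mean_demand in zip(schedule, forecast):
            demand = max(0.0, rng.gauss(mean_demand, sigma * mean_demand))
            total += period_cost(commitment, demand)
    return total / n_samples


def select_best(candidates, forecast, **kwargs):
    """Return the index of the cheapest candidate and all estimated costs."""
    costs = [monte_carlo_cost(s, forecast, **kwargs) for s in candidates]
    best = min(range(len(candidates)), key=costs.__getitem__)
    return best, costs


if __name__ == "__main__":
    forecast = [90.0, 150.0, 120.0]  # hypothetical net-demand forecast, MW
    candidates = [
        [(1, 0, 0), (1, 1, 0), (1, 1, 0)],  # lean commitment
        [(1, 1, 0), (1, 1, 0), (1, 1, 0)],  # moderate reserve
        [(1, 1, 1), (1, 1, 1), (1, 1, 1)],  # everything online
    ]
    best, costs = select_best(candidates, forecast)
    print("estimated costs:", [round(c) for c in costs])
    print("selected candidate:", best)
```

Reusing the same seed for every candidate means all schedules are scored against identical demand scenarios, so cost differences reflect the schedules rather than sampling noise. In the setting the abstract describes, the candidates would instead come from concurrent stochastic MIP solves warm-started with RL solutions.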
