Reinforcement learning and mixed-integer programming for power plant scheduling in low carbon systems: Comparison and hybridisation

Journal

APPLIED ENERGY
Volume 349

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.apenergy.2023.121659

Keywords

Unit commitment; Reinforcement learning; Mixed-integer programming; Renewable power uncertainty

Decarbonisation is driving the growth of renewable power generation and increasing uncertainty in power plant scheduling. This paper compares traditional mathematical programming methods with emerging reinforcement learning methods, finding that the former are more reliable and scalable and produce lower-cost schedules. However, the strength of reinforcement learning lies in its ability to produce solutions almost instantly.
Decarbonisation is driving dramatic growth in renewable power generation. This increases uncertainty in the load to be served by power plants and makes their efficient scheduling, known as the unit commitment (UC) problem, more difficult. UC is solved in practice by mixed-integer programming (MIP) methods; however, there is growing interest in emerging data-driven methods, including reinforcement learning (RL). In this paper, we extensively test two MIP methods (deterministic and stochastic) and two RL methods (model-free and with lookahead) over a large set of test days and problem sizes, for the first time comparing the state of the art of these two approaches on a level playing field. We find that deterministic and stochastic MIP consistently produce lower-cost UC schedules than RL, exhibiting better reliability and scalability with problem size. Average operating costs of RL are more than twice those of stochastic MIP for a 50-generator test case, and 13 times larger in the worst instance. However, the key strength of RL is its ability to produce solutions practically instantly, irrespective of problem size. We leverage this advantage to produce various initial solutions for warm-starting concurrent stochastic MIP solves. By producing several near-optimal solutions simultaneously and then evaluating them using Monte Carlo methods, the differences between the true cost function and the discrete approximation required to formulate the MIP are exploited. The resulting hybrid technique outperforms both the RL and MIP methods individually, reducing total operating costs by 0.3% on average.
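
The final selection step of the hybrid approach (score several candidate schedules by Monte Carlo simulation and keep the cheapest) can be illustrated with a minimal sketch. This is not the authors' implementation: the generator data, the lost-load penalty, the Gaussian demand model, and the function names `period_cost`, `monte_carlo_cost` and `pick_best` are all hypothetical, and the candidate schedules are assumed to come from elsewhere (for example, from warm-started MIP solves).

```python
import random

# Hypothetical toy fleet: (capacity MW, marginal cost $/MWh, no-load cost $/h)
GENERATORS = [(100.0, 20.0, 500.0), (50.0, 40.0, 200.0), (30.0, 90.0, 50.0)]
LOST_LOAD_PENALTY = 10_000.0  # $/MWh of unserved demand (assumed value)


def period_cost(commitment, demand):
    """Cost of one period: dispatch the committed units in merit order."""
    cost = 0.0
    remaining = max(demand, 0.0)
    online = [g for g, on in zip(GENERATORS, commitment) if on]
    for capacity, marginal, no_load in sorted(online, key=lambda g: g[1]):
        output = min(capacity, remaining)
        cost += no_load + marginal * output
        remaining -= output
    # Any demand left unserved is charged at the penalty rate.
    return cost + LOST_LOAD_PENALTY * remaining


def monte_carlo_cost(schedule, forecast, sigma, n_samples, seed=0):
    """Mean total cost of a schedule over sampled demand scenarios."""
    rng = random.Random(seed)  # same seed: every candidate sees the same scenarios
    total = 0.0
    for _ in range(n_samples):
        for commitment, mean_demand in zip(schedule, forecast):
            total += period_cost(commitment, rng.gauss(mean_demand, sigma))
    return total / n_samples


def pick_best(candidates, forecast, sigma=10.0, n_samples=500):
    """Return the index of the cheapest candidate and all estimated costs."""
    costs = [monte_carlo_cost(s, forecast, sigma, n_samples) for s in candidates]
    best = min(range(len(candidates)), key=costs.__getitem__)
    return best, costs
```

Each schedule here is a list with one on/off tuple per period, for example `[(1, 0, 0), (1, 1, 0)]` for a two-period horizon. Reusing the same random seed for every candidate means all schedules are compared on identical scenarios, which reduces noise in the ranking.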
