
Reinforcement learning and mixed-integer programming for power plant scheduling in low carbon systems: Comparison and hybridisation

APPLIED ENERGY (2023)

Journal

APPLIED ENERGY

Volume 349

Publisher

ELSEVIER SCI LTD

DOI: 10.1016/j.apenergy.2023.121659

Keywords

Unit commitment; Reinforcement learning; Mixed-integer programming; Renewable power uncertainty

Automated Summary

Decarbonisation is driving the growth of renewable power generation and increasing uncertainty in power plant scheduling. This paper compares traditional mathematical programming methods with emerging reinforcement learning methods, finding that the former are more reliable and scalable and produce lower-cost schedules. However, the strength of reinforcement learning lies in its ability to produce solutions practically instantly.

Abstract

Decarbonisation is driving dramatic growth in renewable power generation. This increases uncertainty in the load to be served by power plants and makes their efficient scheduling, known as the unit commitment (UC) problem, more difficult. UC is solved in practice by mixed-integer programming (MIP) methods; however, there is growing interest in emerging data-driven methods, including reinforcement learning (RL). In this paper, we extensively test two MIP methods (deterministic and stochastic) and two RL methods (model-free and with lookahead) over a large set of test days and problem sizes, for the first time comparing the state of the art of these two approaches on a level playing field. We find that deterministic and stochastic MIP consistently produce lower-cost UC schedules than RL, exhibiting better reliability and scalability with problem size. For a 50-generator test case, the average operating costs of RL are more than twice those of stochastic MIP, and 13 times larger in the worst instance. However, the key strength of RL is its ability to produce solutions practically instantly, irrespective of problem size. We leverage this advantage to produce various initial solutions for warm-starting concurrent stochastic MIP solves. By producing several near-optimal solutions simultaneously and then evaluating them using Monte Carlo methods, we exploit the differences between the true cost function and the discrete approximation required to formulate the MIP. The resulting hybrid technique outperforms both the RL and MIP methods individually, reducing total operating costs by 0.3% on average.
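The final step described in the abstract, evaluating several candidate schedules by Monte Carlo and keeping the cheapest, can be illustrated with a minimal sketch. This is not the authors' implementation: the three-generator fleet, the cost figures, the Gaussian demand noise, the lost-load penalty, and all function names below are hypothetical assumptions, and the dispatch is a simple merit-order rule that ignores ramping and minimum up/down times. The candidate schedules are assumed to be supplied already (in the paper they would come from the RL-warm-started MIP solves).

```python
import random

# Hypothetical toy fleet (assumption, not from the paper):
# (capacity in MW, marginal cost in $/MWh, fixed cost in $ per committed period)
GENERATORS = [
    (100.0, 20.0, 500.0),
    (80.0, 35.0, 300.0),
    (50.0, 60.0, 100.0),
]
LOST_LOAD_COST = 10_000.0  # assumed penalty in $/MWh for unserved demand


def dispatch_cost(commitment, demand):
    """Cost of one period: fixed costs of committed units, merit-order
    dispatch of those units, and a penalty on any unserved demand."""
    committed = [g for g, on in zip(GENERATORS, commitment) if on]
    cost = sum(fixed for _, _, fixed in committed)
    remaining = demand
    for capacity, marginal, _ in sorted(committed, key=lambda g: g[1]):
        output = min(capacity, max(remaining, 0.0))
        cost += output * marginal
        remaining -= output
    cost += max(remaining, 0.0) * LOST_LOAD_COST
    return cost


def sample_scenario(forecast, sigma, rng):
    """Draw one demand trajectory around the forecast (assumed Gaussian noise)."""
    return [max(0.0, rng.gauss(d, sigma)) for d in forecast]


def monte_carlo_cost(schedule, scenarios):
    """Average total cost of a fixed commitment schedule over all scenarios."""
    total = 0.0
    for scenario in scenarios:
        total += sum(dispatch_cost(c, d) for c, d in zip(schedule, scenario))
    return total / len(scenarios)


def select_best(candidates, forecast, sigma=15.0, n_scenarios=500, seed=0):
    """Score every candidate on the same sampled scenarios and return the
    name of the cheapest one together with all estimated costs."""
    rng = random.Random(seed)
    scenarios = [sample_scenario(forecast, sigma, rng) for _ in range(n_scenarios)]
    costs = {name: monte_carlo_cost(s, scenarios) for name, s in candidates.items()}
    best = min(costs, key=costs.get)
    return best, costs


if __name__ == "__main__":
    forecast = [90.0, 150.0, 120.0]  # hypothetical three-period demand forecast
    candidates = {
        "lean": [(1, 0, 0), (1, 1, 0), (1, 1, 0)],
        "reserve": [(1, 1, 0), (1, 1, 1), (1, 1, 0)],
    }
    best, costs = select_best(candidates, forecast)
    print(best, costs)
```

Scoring all candidates on the same sampled scenarios keeps the comparison between them from being dominated by sampling noise. A schedule that looks slightly more expensive under a discretised MIP objective may still come out cheaper under this kind of simulation, which is the effect the abstract says the hybrid method exploits.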
