Article

Deep reinforcement learning based optimization for a tightly coupled nuclear renewable integrated energy system

Journal

APPLIED ENERGY
Volume 328

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.apenergy.2022.120113

Keywords

Nuclear renewable integrated energy system; Deep reinforcement learning; System control; Operation optimization

Funding

  1. INL Laboratory Directed Research and Development (LDRD) Program, USA under DOE Idaho Operations Office [DE-AC07-05ID14517]
  2. Office of Nuclear Energy of the U.S. Department of Energy
  3. Nuclear Science User Facilities

Abstract

New ways to integrate energy systems to maximize efficiency are being sought to meet carbon-emissions goals. Nuclear-renewable integrated energy system (NR-IES) concepts are a leading solution: they couple a nuclear power plant with renewable energy, hydrogen generation plants, and energy storage systems so that thermal and electrical power are dispatchable to fulfill grid-flexibility requirements while also producing hydrogen and maximizing revenue. This paper introduces a deep reinforcement learning (DRL)-based framework to address the complex decision-making tasks for NR-IES. The objective is to maximize revenue by generating and selling hydrogen and electricity simultaneously according to their time-varying prices while keeping the energy flows among the subsystems in balance. A Python-based simulator for an NR-IES concept has been developed and integrated with OpenAI Gym and Ray/RLlib to provide an efficient and flexible computational framework for DRL research and development. Three state-of-the-art DRL algorithms, twin-delayed deep deterministic policy gradient (TD3), soft actor-critic (SAC), and proximal policy optimization (PPO), have been investigated to illustrate DRL's superiority for controlling NR-IES over a conventional control approach, particle swarm optimization (PSO). In this effort, PPO showed more stable performance and better generalization capability than SAC and TD3. Comparisons with PSO demonstrated that, on average, PPO achieves 13.9% higher mean episode returns during training and 29.4% higher mean episode returns during testing when different hydrogen-production targets are applied.
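
The abstract describes the computational setup only at a high level: a Python simulator of the NR-IES wrapped as an OpenAI Gym environment and trained with Ray/RLlib. The sketch below illustrates what such a coupling can look like. It is not the authors' implementation: the plant parameters, price profiles, hydrogen-conversion factor, and reward penalty are placeholder assumptions, and the paper's full energy-storage and energy-balance constraints are omitted. The environment uses the Gymnasium successor API for compatibility with current RLlib releases, whereas the paper used OpenAI Gym.

```python
# Minimal sketch of an NR-IES dispatch environment plus a PPO training loop.
# Illustrative only: every numeric constant below is a placeholder, not a
# value from the paper.
import numpy as np
import gymnasium as gym
from gymnasium import spaces


class NRIESEnv(gym.Env):
    """Toy nuclear-renewable IES dispatch problem.

    Observation: [electricity price, hydrogen price, hydrogen production target].
    Action:      fraction of nuclear thermal power sent to the hydrogen plant;
                 the remainder is converted to electricity and sold to the grid.
    Reward:      revenue from electricity and hydrogen sales minus a penalty
                 for missing the hydrogen production target.
    """

    def __init__(self, env_config=None):
        cfg = env_config or {}
        self.thermal_capacity_mw = cfg.get("thermal_capacity_mw", 300.0)
        self.h2_target_kg = cfg.get("h2_target_kg", 500.0)   # per time step
        self.episode_len = cfg.get("episode_len", 96)        # e.g. 15-min steps over a day
        self.observation_space = spaces.Box(0.0, np.inf, shape=(3,), dtype=np.float32)
        self.action_space = spaces.Box(0.0, 1.0, shape=(1,), dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.t = 0
        return self._obs(), {}

    def _obs(self):
        # Sinusoidal price profiles stand in for real time-varying market data.
        elec_price = 30.0 + 20.0 * np.sin(2.0 * np.pi * self.t / self.episode_len)  # $/MWh
        h2_price = 2.0 + 0.5 * np.cos(2.0 * np.pi * self.t / self.episode_len)      # $/kg
        return np.array([elec_price, h2_price, self.h2_target_kg], dtype=np.float32)

    def step(self, action):
        frac_h2 = float(np.clip(action[0], 0.0, 1.0))
        elec_price, h2_price, target = self._obs()

        thermal_to_h2 = frac_h2 * self.thermal_capacity_mw          # MW to hydrogen plant
        thermal_to_grid = self.thermal_capacity_mw - thermal_to_h2  # MW to power conversion

        h2_produced_kg = 3.0 * thermal_to_h2             # placeholder conversion factor
        electricity_mwh = 0.33 * thermal_to_grid * 0.25  # placeholder efficiency, 15-min step

        revenue = elec_price * electricity_mwh + h2_price * h2_produced_kg
        penalty = 0.5 * abs(h2_produced_kg - target)  # encourage meeting the H2 target

        self.t += 1
        terminated = self.t >= self.episode_len
        return self._obs(), float(revenue - penalty), terminated, False, {}


if __name__ == "__main__":
    # Train PPO (the algorithm the abstract reports as most stable) with Ray/RLlib.
    from ray.rllib.algorithms.ppo import PPOConfig

    config = (
        PPOConfig()
        .environment(NRIESEnv, env_config={"h2_target_kg": 500.0})
        .training(gamma=0.99, lr=3e-4, train_batch_size=4000)
    )
    algo = config.build()
    for _ in range(20):
        result = algo.train()
        # Metric key names vary across RLlib versions.
        print(result.get("episode_reward_mean"))
```

In a faithful reproduction, the placeholder price curves would be replaced by historical market data, the conversion factors by the thermal and electrolysis models of the simulator described in the paper, and SAC or TD3 could be swapped in through the corresponding RLlib config classes to reproduce the comparison the authors report.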
