Article

Comparative Analysis of Energy Management Strategies for HEV: Dynamic Programming and Reinforcement Learning

Journal

IEEE ACCESS
Volume 8, Pages 67112-67123

Publisher

Institute of Electrical and Electronics Engineers (IEEE)
DOI: 10.1109/ACCESS.2020.2986373

Keywords

Hybrid electric vehicles; Energy management; Optimal control; Engines; Dynamic programming; Fuel economy; Learning (artificial intelligence); Reinforcement learning; Power management

Funding

  1. Ministry of Trade, Industry, and Energy (MOTIE), South Korea [20002762]
  2. Korea Evaluation Institute of Industrial Technology (KEIT) [20002762]

Abstract

The energy management strategy is an important factor in determining the fuel economy of hybrid electric vehicles; consequently, how to distribute the required power between the engine and the motors of a hybrid vehicle has been the subject of much research. Recently, various studies have applied reinforcement learning to the optimal control of hybrid electric vehicles. The fundamental control framework of reinforcement learning has much in common with control approaches based on deterministic or stochastic dynamic programming. In this study, we compare the reinforcement learning-based strategy with these dynamic programming-based control approaches. For the optimal control of a hybrid electric vehicle, each method was compared in terms of fuel efficiency through simulations over various driving cycles. Based on these simulations, we show that the reinforcement learning-based strategy can attain the global optimum of the infinite-horizon optimal control problem, which can also be obtained by stochastic dynamic programming. We also show that the reinforcement learning-based strategy can provide a solution close to the optimum obtained by deterministic dynamic programming, while being better suited to a time-variant controller with boundary value constraints. In addition, we verify the convergence characteristics of the reinforcement learning-based strategy when transfer learning is performed through value initialization using stochastic dynamic programming.
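To make the reinforcement learning side of the comparison concrete, the sketch below shows a tabular Q-learning power-split controller for a toy hybrid powertrain. It is not the paper's vehicle model: the state and action grids, the fuel and battery models, the reward shaping, and all hyperparameters are illustrative assumptions. The hypothetical q_init argument only illustrates the value-initialization (transfer learning) idea, where a table produced by stochastic dynamic programming could warm-start learning.

```python
import numpy as np

# Hypothetical discretization: these grids, the fuel model, and the reward
# shaping are illustrative assumptions, not the paper's actual vehicle model.
SOC_BINS  = np.linspace(0.4, 0.8, 21)    # battery state of charge
PDEM_BINS = np.linspace(0.0, 40.0, 21)   # power demand [kW]
ACTIONS   = np.linspace(0.0, 1.0, 11)    # engine share of the demanded power

ALPHA, GAMMA, EPS = 0.1, 0.98, 0.1       # learning rate, discount, exploration

def fuel_rate(p_eng_kw):
    """Toy fuel model: affine in engine power when the engine is on."""
    return 0.0 if p_eng_kw <= 0.0 else 0.3 + 0.08 * p_eng_kw

def step(soc, p_dem, a):
    """One-step transition of the toy HEV model (assumed dynamics)."""
    p_eng = a * p_dem
    p_bat = p_dem - p_eng                 # battery supplies the remainder
    soc_next = np.clip(soc - 1e-3 * p_bat, SOC_BINS[0], SOC_BINS[-1])
    # Reward = negative fuel use plus a soft penalty for drifting from 0.6 SOC
    reward = -(fuel_rate(p_eng) + 50.0 * (soc_next - 0.6) ** 2)
    return soc_next, reward

def run_q_learning(cycle_pdem, q_init=None, episodes=200):
    """Tabular Q-learning over a repeated driving cycle.

    q_init allows warm-starting the table, e.g. from a value function computed
    by stochastic dynamic programming (the transfer-learning idea above);
    None starts from zeros.
    """
    q = (np.zeros((len(SOC_BINS), len(PDEM_BINS), len(ACTIONS)))
         if q_init is None else q_init.copy())
    rng = np.random.default_rng(0)
    for _ in range(episodes):
        soc = 0.6
        for p_dem in cycle_pdem:
            s = (np.digitize(soc, SOC_BINS) - 1,
                 np.digitize(p_dem, PDEM_BINS) - 1)
            a_idx = (rng.integers(len(ACTIONS)) if rng.random() < EPS
                     else int(np.argmax(q[s])))
            soc_next, r = step(soc, p_dem, ACTIONS[a_idx])
            s_next = (np.digitize(soc_next, SOC_BINS) - 1,
                      np.digitize(p_dem, PDEM_BINS) - 1)
            q[s][a_idx] += ALPHA * (r + GAMMA * np.max(q[s_next]) - q[s][a_idx])
            soc = soc_next
    return q

if __name__ == "__main__":
    # Synthetic driving cycle standing in for a standard cycle such as FTP-72
    demand = np.abs(20 + 15 * np.sin(np.linspace(0, 6 * np.pi, 600)))
    q_table = run_q_learning(demand)
    print("Greedy engine share at SOC=0.6, Pdem=25 kW:",
          ACTIONS[int(np.argmax(q_table[10, 12]))])
```

Passing a dynamic-programming-derived table as q_init instead of zeros typically reduces the number of episodes needed before the greedy policy stabilizes, which is the convergence behavior the study examines.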
