Article

Double Deep Reinforcement Learning-Based Energy Management for a Parallel Hybrid Electric Vehicle With Engine Start-Stop Strategy

Journal

IEEE Transactions on Transportation Electrification

Publisher

IEEE (Institute of Electrical and Electronics Engineers, Inc.)
DOI: 10.1109/TTE.2021.3101470

Keywords

Energy management; Engines; Hybrid electric vehicles; Transportation; Mechanical power transmission; Fuel economy; Resistance; Double deep reinforcement learning (DRL); energy management strategy (EMS); engine start-stop strategy; gear-shifting strategy; hybrid electric vehicle (HEV)

Funding

  1. National Natural Science Foundation of China [52072051]
  2. State Key Laboratory of Mechanical System and Vibration [MSV202016]


This article proposes an energy management strategy based on deep reinforcement learning to optimize the fuel economy of hybrid electric vehicles. By learning a gear-shifting strategy and controlling the engine throttle opening, the proposed strategy reduces fuel consumption and improves computational efficiency.
Committed to optimizing the fuel economy of hybrid electric vehicles (HEVs), improving the working conditions of the engine, and promoting research on deep reinforcement learning (DRL) in the field of energy management strategies (EMSs), this article first proposes a DRL-based EMS combined with a rule-based engine start-stop strategy. Moreover, because both the engine and the transmission are controlled components, this article develops a novel double DRL (DDRL)-based EMS, which uses a deep Q-network (DQN) to learn the gear-shifting strategy and a deep deterministic policy gradient (DDPG) agent to control the engine throttle opening; the DDRL-based EMS thus realizes synchronized multiobjective control through different types of learning algorithms. After offline training, online simulation tests show that, while overcoming some inherent flaws of deterministic dynamic programming (DDP), the fuel consumption gaps of the proposed DRL- and DDRL-based EMSs relative to the DDP-based EMS are -0.55% and 2.33%, respectively. Computational efficiency is also significantly improved, with an average output time of 0.91 ms per action. Therefore, both the strategy that combines learning- and rule-based controls and the multiobjective control strategy have the potential to ensure optimality and real-time performance.
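The paper's code is not published here; the following is a minimal, hypothetical PyTorch sketch of the double-DRL arrangement the abstract describes: a DQN selects a discrete gear, a DDPG actor outputs a continuous throttle opening, and a simple rule-based gate decides engine start-stop. All network sizes, state variables, and thresholds are illustrative assumptions, not values from the paper, and the training loops (replay buffers, target networks, critics) are omitted.

```python
# Hypothetical sketch of a DDRL-based EMS: DQN for discrete gear choice,
# DDPG actor for continuous throttle, rule-based engine start-stop gate.
# Dimensions and thresholds below are assumptions for illustration only.
import torch
import torch.nn as nn

STATE_DIM = 4   # assumed state: [vehicle speed, acceleration, battery SOC, current gear]
NUM_GEARS = 6   # assumed number of discrete gear positions

class GearDQN(nn.Module):
    """DQN head: Q-values over the discrete gear choices."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, NUM_GEARS),
        )

    def forward(self, state):
        return self.net(state)

class ThrottleActor(nn.Module):
    """DDPG actor: continuous engine throttle opening in [0, 1]."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Sigmoid(),
        )

    def forward(self, state):
        return self.net(state)

def engine_on(soc, demand_kw, soc_min=0.3, demand_on_kw=8.0):
    """Rule-based start-stop gate (assumed thresholds): run the engine
    only when power demand is high or the battery SOC is low."""
    return demand_kw > demand_on_kw or soc < soc_min

# One control step (greedy inference only):
dqn, actor = GearDQN(), ThrottleActor()
state = torch.tensor([[15.0, 0.5, 0.55, 3.0]])   # illustrative state vector
gear = int(dqn(state).argmax(dim=1))             # DQN: discrete gear action
throttle = float(actor(state))                   # DDPG: continuous throttle action
if not engine_on(soc=0.55, demand_kw=5.0):
    throttle = 0.0                               # engine off: no throttle command
print(gear, throttle)
```

The split mirrors the abstract's design choice: a value-based learner (DQN) naturally handles the discrete gear set, while a policy-gradient learner (DDPG) handles the continuous throttle, allowing the two heterogeneous actions to be issued synchronously at each control step.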

