Article

Double Deep Reinforcement Learning-Based Energy Management for a Parallel Hybrid Electric Vehicle With Engine Start-Stop Strategy

Journal

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TTE.2021.3101470

Keywords

Energy management; Engines; Hybrid electric vehicles; Transportation; Mechanical power transmission; Fuel economy; Resistance; Double deep reinforcement learning (DRL); energy management strategy (EMS); engine start-stop strategy; gear-shifting strategy; hybrid electric vehicle (HEV)

Funding

  1. National Natural Science Foundation of China [52072051]
  2. State Key Laboratory of Mechanical System and Vibration [MSV202016]


This article proposes an energy management strategy based on deep reinforcement learning to optimize the fuel economy of hybrid electric vehicles. By learning a gear-shifting strategy and controlling the engine throttle opening, the proposed strategy reduces fuel consumption and improves computational efficiency.
Committed to optimizing the fuel economy of hybrid electric vehicles (HEVs), improving the operating conditions of the engine, and advancing research on deep reinforcement learning (DRL) for energy management strategies (EMSs), this article first proposes a DRL-based EMS combined with a rule-based engine start-stop strategy. Moreover, since both the engine and the transmission are controlled components, the article develops a novel double DRL (DDRL)-based EMS, which uses a deep Q-network (DQN) to learn the gear-shifting strategy and a deep deterministic policy gradient (DDPG) agent to control the engine throttle opening; the DDRL-based EMS thereby realizes synchronized multiobjective control through different types of learning algorithms. After offline training, online simulation tests show that the fuel consumption of the proposed DRL- and DDRL-based EMSs differs from that of the deterministic dynamic programming (DDP)-based EMS by only -0.55% and 2.33%, respectively, while overcoming some inherent flaws of DDP. Computational efficiency is also significantly improved, with an average output time of 0.91 ms per action. Therefore, both the control strategy combining learning- and rule-based controls and the multiobjective control strategy have the potential to ensure near-optimal performance with real-time efficiency.
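The abstract's core architectural idea — a DQN choosing the discrete gear while a DDPG actor outputs the continuous throttle opening, gated by a rule-based start-stop flag — can be illustrated with a minimal sketch. The state variables (SOC, speed, demanded power), network sizes, and gear count below are illustrative assumptions, not values from the paper, and the untrained NumPy networks stand in for the paper's trained DQN and DDPG models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: state = (SOC, vehicle speed, demanded power)
STATE_DIM, N_GEARS, HIDDEN = 3, 6, 16

def init_mlp(in_dim, out_dim):
    """One-hidden-layer MLP parameters (illustrative, untrained)."""
    return {
        "W1": rng.normal(0, 0.1, (in_dim, HIDDEN)),
        "b1": np.zeros(HIDDEN),
        "W2": rng.normal(0, 0.1, (HIDDEN, out_dim)),
        "b2": np.zeros(out_dim),
    }

def forward(p, x):
    h = np.tanh(x @ p["W1"] + p["b1"])
    return h @ p["W2"] + p["b2"]

dqn = init_mlp(STATE_DIM, N_GEARS)   # DQN head: one Q-value per discrete gear
actor = init_mlp(STATE_DIM, 1)       # DDPG actor: continuous throttle command

def select_actions(state, engine_on):
    """Joint action: discrete gear from the DQN, continuous throttle from DDPG.

    A rule-based start-stop flag gates the throttle: when the engine is off,
    the throttle is forced to zero and the motor alone meets the demand.
    """
    gear = int(np.argmax(forward(dqn, state)))                      # discrete choice
    throttle = float(1 / (1 + np.exp(-forward(actor, state)[0])))   # squash to (0, 1)
    if not engine_on:
        throttle = 0.0
    return gear, throttle

state = np.array([0.6, 15.0, 20.0])  # SOC, speed (m/s), demanded power (kW)
print(select_actions(state, engine_on=True))
```

The split mirrors the multiobjective design described above: the discrete and continuous control channels are handled by the algorithm family suited to each action space, and the rule-based start-stop logic sits outside the learned policies.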
