Article

A novel energy management strategy of hybrid electric vehicle via an improved TD3 deep reinforcement learning

Journal

ENERGY
Volume 224

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.energy.2021.120118

Keywords

Hybrid electric vehicle; Energy management strategy; Deep reinforcement learning; TD3

Funding

  1. National Natural Science Foundation of China [51805254]
  2. Fundamental Research Funds of Jiangsu Province Key Laboratory of Aerospace Power System [CEPE2019001]
  3. China Postdoctoral Science Foundation [2018M642244]
  4. National Key Laboratory of Science and Technology on Helicopter Transmission [HTLA20K02]

In this study, a deep reinforcement learning algorithm, TD3, is used to develop an intelligent energy management strategy (EMS) for hybrid electric vehicles. The EMS incorporates a rule-based local controller (LC) and a hybrid experience replay (HER) method. Compared with typical value-based and policy-based DRL strategies, the improved TD3-based EMS shows the best fuel optimality, the fastest convergence speed, and the highest robustness under different driving cycles.
The formulation of a highly efficient energy management strategy (EMS) for hybrid electric vehicles (HEVs) has become the most crucial task owing to the variation in electrified powertrain topology and the uncertainty of driving scenarios. In this study, a deep reinforcement learning (DRL) algorithm, namely TD3, is leveraged to derive an intelligent EMS for an HEV. A heuristic rule-based local controller (LC) is embedded within the DRL loop to eliminate irrational torque allocation by taking the characteristics of the powertrain components into account. To mitigate the influence of environmental disturbance, a hybrid experience replay (HER) method is proposed based on a mixed experience buffer (MEB) consisting of offline-computed optimal experience and online-learned experience. The results indicate that the improved TD3-based EMS achieves the best fuel optimality, the fastest convergence speed, and the highest robustness in comparison with typical value-based and policy-based DRL EMSs under various driving cycles. The LC boosts the convergence speed of the TD3-based EMS by giving exploration a warm start. Meanwhile, with HER coupled with the MEB, the impact of environmental disturbance, including load mass and road gradient introduced as additional input observations, on the performance of the TD3-based EMS becomes negligible. (c) 2021 Elsevier Ltd. All rights reserved.
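
The abstract describes the mixed experience buffer only at a high level, so the following is a minimal Python sketch of the general idea rather than the authors' implementation. It assumes that transitions are opaque tuples, that the offline optimal experience is precomputed and kept fixed, and that each mini-batch draws a fixed fraction from the offline store. The class name `MixedExperienceBuffer` and the parameters `offline_ratio` and `online_capacity` are hypothetical and do not come from the paper.

```python
import random
from collections import deque


class MixedExperienceBuffer:
    """Hypothetical mixed buffer: fixed offline transitions plus a rolling online store.

    Illustrative sketch only; the mixing rule used in the paper is not given in the abstract.
    """

    def __init__(self, offline_transitions, online_capacity=10000,
                 offline_ratio=0.25, seed=None):
        if not 0.0 <= offline_ratio <= 1.0:
            raise ValueError("offline_ratio must lie in [0, 1]")
        # Offline experience (e.g. computed beforehand by an optimal solver) is never evicted.
        self.offline = list(offline_transitions)
        # Online experience is a FIFO store: the oldest transitions are dropped first.
        self.online = deque(maxlen=online_capacity)
        self.offline_ratio = offline_ratio
        self.rng = random.Random(seed)

    def add(self, transition):
        """Store one transition collected online by the agent."""
        self.online.append(transition)

    def sample(self, batch_size):
        """Draw a batch (with replacement) mixing offline and online transitions."""
        if not self.offline and not self.online:
            raise ValueError("cannot sample from an empty buffer")
        # Assumed mixing rule: a fixed share of each batch comes from the offline store.
        n_offline = round(batch_size * self.offline_ratio) if self.offline else 0
        if not self.online:
            # Before any online data exists, fall back to offline experience only.
            n_offline = batch_size
        n_online = batch_size - n_offline
        batch = [self.offline[self.rng.randrange(len(self.offline))]
                 for _ in range(n_offline)]
        batch += [self.online[self.rng.randrange(len(self.online))]
                  for _ in range(n_online)]
        self.rng.shuffle(batch)
        return batch
```

In a TD3 training loop, each environment step would call `add` with the new transition, and each update would call `sample` in place of a uniform replay draw. A ratio that decays over training is another plausible design, but the abstract does not say which scheme the authors use.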
