4.7 Article

Deterministic Promotion Reinforcement Learning Applied to Longitudinal Velocity Control for Automated Vehicles

Journal

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY
Volume 69, Issue 1, Pages 338-348

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TVT.2019.2955959

Keywords

Reinforcement Learning; Cold-start; Policy Gradient Method; Continuous Control; Longitudinal Velocity Control

Funding

  1. National Key Research and Development Program [2016YFB0100904]
  2. National Natural Science Foundation of China [61790564, U1664257]
  3. State Key Laboratory of Comprehensive Technology on Automobile Vibration and Noise and Safety Control [W65-GNZX-2018-0242]

Reinforcement learning is regarded as a promising method for automated vehicles, but the stability and efficiency of its algorithms remain concerns. To improve both, the deterministic promotion reinforcement learning method is proposed, which improves the policy deterministically. To this end, the policy evaluation in the critic and the exploration in the actor are both modified, combining a normalization-based evaluation with a model-free search guide. The critic identifies the right direction for action exploration, and that direction is then used to update the actor and guide its exploration. The modified method reduces the dependence of promotional updates on happening to explore a good action, and it makes only deterministic promotions to the policy. Consequently, the efficiency of the algorithm improves without loss of stability. More notably, the method can relieve the cold-start problem and circumvent the limitations of learning with constrained physical systems. To verify the proposed method, the longitudinal velocity control problem for automated vehicles is considered, covering car-following and non-car-following conditions in a unified form. The learning system is built in CarSim, and several reinforcement learning techniques are used to accelerate learning. Real-vehicle experiments are also presented for validation. The results indicate that the proposed method achieves acceptable learning performance in the continuous longitudinal velocity control problem.
