Journal
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY
Volume 71, Issue 8, Pages 8615-8629
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TVT.2022.3173923
Keywords
Quality of service; Autonomous vehicles; Interference; Vehicle-to-infrastructure; Fuels; Sensors; Reinforcement learning; Beamwidth control; spacing control; two-timescale reinforcement learning; vehicle platooning; vehicle-to-infrastructure communications
Funding
- National Natural Science Foundation of China (NSFC) [61971286]
This paper studies how to minimize the inter-vehicle distance (IVD) in a platoon-based vehicular cyber-physical system (VCPS) while meeting vehicle-to-infrastructure (V2I) quality-of-service (QoS) requirements, and proposes a two-timescale deep reinforcement learning (DRL) framework to control the platoon's beamforming and spacing. Simulation results show that the proposed algorithm can significantly improve the traffic capacity and reduce the air drag coefficient while satisfying the V2I communication QoS requirements.
In a platoon-based vehicular cyber-physical system (VCPS), the traffic capacity and the fuel efficiency can be significantly improved by maintaining a small inter-vehicle distance (IVD). Meanwhile, the vehicle-to-infrastructure (V2I) communications provided by the roadside unit (RSU) are important for promoting the development of autonomous vehicles and reshaping the driving experience. However, because maintaining a small IVD inside the platoon may result in severe inter-beam interference, there is a tradeoff between the traffic capacity and the communication efficiency. In this paper, we formulate an optimization problem to minimize the IVD (a.k.a. spacing) under the V2I quality-of-service (QoS) requirements by jointly controlling the transmit power, the beamwidth, and the IVD in a platoon-based VCPS. The traffic and channel uncertainties significantly complicate the analytical characterization of the impact of IVD, transmit power, and beamwidth on the QoS requirements. To solve this intractable problem and capture the inherent timescale difference between the communication and vehicular systems, we propose a novel data-driven two-timescale reinforcement learning framework to control the beamforming and the spacing of the platoon at two timescales. With elaborately designed state, action, and reward, we develop a soft actor-critic based two-timescale deep reinforcement learning (DRL) algorithm to learn an effective policy for the control-communication co-design problem. Simulation results validate the effectiveness of spacing control and show that the proposed two-timescale DRL algorithm can significantly improve the traffic capacity and reduce the air drag coefficient while satisfying the QoS requirements of V2I communications.
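As a rough illustration of the two-timescale idea described in the abstract, the sketch below shows one way a control loop might be nested: the spacing (IVD) is chosen once per slow step, while transmit power and beamwidth are chosen at every fast step within it. This is not the paper's algorithm. The function names, the random placeholder policies, the value ranges, and the step counts are all assumptions made for illustration; in the paper the policies are learned with a soft actor-critic based DRL method, which is not reproduced here.

```python
import random


def random_slow_policy(rng):
    """Hypothetical slow-timescale policy: pick a spacing (IVD) in metres."""
    return rng.uniform(5.0, 30.0)  # assumed range, not from the paper


def random_fast_policy(spacing, rng):
    """Hypothetical fast-timescale policy: pick (transmit power, beamwidth).

    The current spacing is passed in because the fast decisions are made
    under the spacing fixed by the slow timescale; this placeholder ignores it.
    """
    power = rng.uniform(0.1, 1.0)       # assumed normalized power
    beamwidth = rng.uniform(5.0, 60.0)  # assumed beamwidth in degrees
    return power, beamwidth


def run_two_timescale(slow_policy, fast_policy,
                      n_slow_steps=3, fast_per_slow=5, seed=0):
    """Nested loop: one spacing action per slow step, one
    (power, beamwidth) action per fast step inside it."""
    rng = random.Random(seed)
    log = []
    for t_slow in range(n_slow_steps):
        spacing = slow_policy(rng)  # held fixed for the whole slow step
        for t_fast in range(fast_per_slow):
            power, beamwidth = fast_policy(spacing, rng)
            log.append((t_slow, t_fast, spacing, power, beamwidth))
    return log


log = run_two_timescale(random_slow_policy, random_fast_policy)
```

Each entry in `log` records the slow step, the fast step, and the three control values, so the spacing stays constant across the fast steps of a given slow step. A learned agent would replace the two placeholder policies and would additionally need the state and reward definitions, which the abstract mentions but does not specify.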