Article

A dynamic clustering technique based on deep reinforcement learning for Internet of vehicles

Journal

JOURNAL OF INTELLIGENT MANUFACTURING
Volume 32, Issue 3, Pages 757-768

Publisher

SPRINGER
DOI: 10.1007/s10845-020-01722-7

Keywords

Deep reinforcement learning; Internet of vehicles; Clustering; Reinforcement learning; Optimization

The Internet of Vehicles (IoV) connects vehicles to the Internet to transfer information, and network clustering strategies are proposed to solve traffic management challenges in IoV networks. Reinforcement learning is used to learn optimal policies, and an experience-driven approach based on deep reinforcement learning is proposed for efficiently selecting cluster heads in managing network resources in the IoV environment.
The Internet of Vehicles (IoV) is a communication paradigm that connects vehicles to the Internet to transfer information between networks. One of the key challenges in IoV is managing the massive amount of traffic generated by a large number of connected IoT-based vehicles. Network clustering strategies have been proposed to address traffic management in IoV networks, and traditional optimization approaches have been proposed to manage network resources efficiently. However, the next-generation IoV environment is highly dynamic, and existing optimization techniques cannot precisely model the dynamic characteristics of IoV networks. Reinforcement learning is a model-free technique in which an agent learns optimal policies by interacting with its environment. We propose an experience-driven approach based on an Actor-Critic Deep Reinforcement Learning framework (AC-DRL) that efficiently selects the cluster head (CH) for managing network resources while accounting for the noisy nature of the IoV environment. The agent in the proposed AC-DRL can efficiently approximate and learn the state-action value function of the actor and the action function of the critic for selecting the CH under the dynamic conditions of the network. The experimental results show improvements of 28% and 15% in satisfying the SLA requirement, and of 35% and 14% in throughput, compared to the static and DQN approaches, respectively.
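
To make the actor-critic idea in the abstract concrete, the following is a minimal tabular sketch, not the paper's AC-DRL implementation. It assumes a toy setting that does not come from the paper: a handful of discrete network states, a few candidate vehicles per state, a hidden mean link quality for each candidate, and Gaussian noise on the observed reward. The names `train_cluster_head_selector`, `greedy_heads`, and `mean_quality`, and all hyperparameters, are hypothetical. The actor is a softmax policy over candidate cluster heads, and the critic is a state-value table updated from the one-step temporal-difference error; the paper uses deep networks for both.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into selection probabilities."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_cluster_head_selector(mean_quality, steps=20000, actor_lr=0.1,
                                critic_lr=0.1, gamma=0.5, noise=0.1, seed=0):
    """One-step actor-critic on a toy cluster-head selection task.

    mean_quality[s][v] is the (assumed, hidden) mean reward for choosing
    vehicle v as cluster head in network state s.
    """
    rng = random.Random(seed)
    n_states = len(mean_quality)
    n_vehicles = len(mean_quality[0])
    theta = [[0.0] * n_vehicles for _ in range(n_states)]  # actor preferences
    value = [0.0] * n_states                               # critic estimates

    state = rng.randrange(n_states)
    for _ in range(steps):
        # Actor: sample a cluster head from the current policy.
        probs = softmax(theta[state])
        action = rng.choices(range(n_vehicles), weights=probs)[0]

        # Toy environment: noisy reward, state changes at random.
        reward = mean_quality[state][action] + rng.gauss(0.0, noise)
        next_state = rng.randrange(n_states)

        # Critic: one-step temporal-difference error and value update.
        td_error = reward + gamma * value[next_state] - value[state]
        value[state] += critic_lr * td_error

        # Actor: policy-gradient step, using the TD error as the advantage.
        for j in range(n_vehicles):
            grad_log_pi = (1.0 if j == action else 0.0) - probs[j]
            theta[state][j] += actor_lr * td_error * grad_log_pi

        state = next_state
    return theta, value


def greedy_heads(theta):
    """Most preferred cluster head in each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in theta]
```

Calling `train_cluster_head_selector([[0.2, 0.9, 0.4], [0.8, 0.1, 0.3]])` and passing the returned preferences to `greedy_heads` yields the learned choice for each of the two toy states. The critic serves only as a baseline that reduces the variance of the actor's updates, which is the property that makes actor-critic methods attractive under noisy rewards.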
