4.7 Article

Scalable Deep Reinforcement Learning-Based Online Routing for Multi-Type Service Requirements

Journal

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TPDS.2023.3284651

Keywords

QoS routing; reinforcement learning; scalability


Emerging applications raise critical QoS requirements for the Internet. Advances in flow classification technologies, software-defined networks (SDN), and programmable network devices make it possible to quickly identify users' requirements and to control routing for fine-grained traffic flows. However, the problem of optimizing forwarding paths online for traffic flows with multiple QoS requirements has not been sufficiently addressed. To address this problem, we propose DRL-OR-S, a highly scalable online routing algorithm based on multi-agent deep reinforcement learning. DRL-OR-S adopts a comprehensive reward function, an efficient learning algorithm, and a novel deep neural network structure to learn appropriate routing strategies for different types of flow requirements. To enhance generalization and scalability, we propose a novel graph-based actor-critic network architecture and a carefully designed input state for DRL-OR-S. To accelerate training and guarantee reliability, we further introduce an NN-simulator for efficient offline training and a safe learning mechanism that avoids unsafe routes during online routing. We implement DRL-OR-S under an SDN architecture and conduct Mininet-based experiments using real network topologies and traffic traces. The results show that DRL-OR-S can simultaneously satisfy the requirements of latency-sensitive, throughput-sensitive, latency-throughput-sensitive, and latency-loss-sensitive flows, while remaining adaptive and reliable under link failure, traffic change, unseen large topologies, and partial deployment.
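
The abstract names four flow types and a single reward function that serves all of them but does not give its form. As a rough illustration only, the sketch below shows one way a per-type reward could weight normalized latency, throughput, and loss scores. The weight table, the normalization constants (max_latency_ms, demand_mbps), and all names are hypothetical assumptions and are not taken from the paper.

```python
from dataclasses import dataclass

# Hypothetical (latency, throughput, loss) weights per flow type.
# These values are illustrative assumptions, not the paper's settings.
WEIGHTS = {
    "latency": (1.0, 0.0, 0.0),
    "throughput": (0.0, 1.0, 0.0),
    "latency-throughput": (0.5, 0.5, 0.0),
    "latency-loss": (0.5, 0.0, 0.5),
}


@dataclass
class FlowMeasurement:
    latency_ms: float       # measured end-to-end latency
    throughput_mbps: float  # measured throughput
    loss_rate: float        # packet loss as a fraction in [0, 1]


def reward(flow_type: str, m: FlowMeasurement,
           max_latency_ms: float = 100.0,
           demand_mbps: float = 10.0) -> float:
    """Weighted sum of per-metric scores, each clipped to [0, 1]."""
    w_lat, w_thr, w_loss = WEIGHTS[flow_type]
    # Lower latency is better: 1.0 at zero latency, 0.0 at or above the cap.
    lat_score = 1.0 - min(max(m.latency_ms, 0.0) / max_latency_ms, 1.0)
    # Higher throughput is better, saturating once the demand is met.
    thr_score = min(max(m.throughput_mbps, 0.0) / demand_mbps, 1.0)
    # Lower loss is better.
    loss_score = 1.0 - min(max(m.loss_rate, 0.0), 1.0)
    return w_lat * lat_score + w_thr * thr_score + w_loss * loss_score


if __name__ == "__main__":
    sample = FlowMeasurement(latency_ms=50.0, throughput_mbps=5.0, loss_rate=0.0)
    for flow_type in WEIGHTS:
        print(flow_type, reward(flow_type, sample))
```

With weights like these, the same measured path scores differently for each flow type, which is the property that would let one agent learn distinct routing strategies per requirement.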
