Article

SCARL: Attentive Reinforcement Learning-Based Scheduling in a Multi-Resource Heterogeneous Cluster

Journal

IEEE ACCESS
Volume 7, Pages 153432-153444

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2019.2948150

Keywords

Resource management; Reinforcement learning; Clustering algorithms; Job shop scheduling; Complexity theory; Servers; Cluster resource management; attentive reinforcement learning; attention; attentive embedding

Funding

  1. Basic Science Research Program through the National Research Foundation of Korea (NRF) - Ministry of Science, ICT and Future Planning [NRF-2016R1E1A1A01943474, NRF-2018R1D1A1A02086102]

Advanced reinforcement learning (RL) technologies have recently increased the opportunity for automating several tasks in cluster management at scale by exploiting repetitive logs of cluster operation and building a learning model for resource allocation and job scheduling. Yet, this trend of adopting RL in the domain of cluster management has not fully addressed the diversity and heterogeneity of jobs and machines in modern cluster environments. In this paper, we present an RL-based scheduler for a multi-resource cluster, namely SCARL (SCheduler with Attentive Reinforcement Learning), concentrating on intricate cluster operating conditions with different resource requirements and capabilities. Specifically, we employ attentive embedding and factored-action scheduling that together efficiently incorporate the time-varying interdependency of jobs and machines in RL processing; they enable an end-to-end scalable policy for scheduling diverse jobs on heterogeneous machines. To the best of our knowledge, we are the first to employ an attention mechanism in RL-based cluster resource management. Through experiments, we demonstrate that our approach is competitive with existing heuristic methods under various cluster simulation configurations, e.g., an average 9.2 enhancement in slowdown over the shortest job first algorithm. Additionally, the approach yields stable performance on our test cluster when running synthetic workloads based on real traces.
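
The abstract compares against the shortest job first baseline using slowdown as the metric. The sketch below illustrates that baseline in a deliberately simplified setting and is not the paper's simulator: it assumes a single machine, a single resource, all jobs arriving at time 0, and slowdown defined as completion time divided by job duration (a common definition that the abstract itself does not spell out). The names `average_slowdown` and `sjf_order` are hypothetical.

```python
from typing import List


def average_slowdown(durations: List[float], order: List[int]) -> float:
    """Mean slowdown when jobs run back to back in the given order.

    Assumption: every job arrives at time 0 and slowdown is
    completion_time / duration for each job.
    """
    clock = 0.0
    total = 0.0
    for idx in order:
        clock += durations[idx]          # job finishes at the current clock
        total += clock / durations[idx]  # slowdown of this job
    return total / len(durations)


def sjf_order(durations: List[float]) -> List[int]:
    """Indices of the jobs sorted by increasing duration (shortest job first)."""
    return sorted(range(len(durations)), key=lambda i: durations[i])


# Toy workload (made-up durations, for illustration only).
durations = [4.0, 1.0, 2.0]
arrival_order = [0, 1, 2]

fifo_slowdown = average_slowdown(durations, arrival_order)
sjf_slowdown = average_slowdown(durations, sjf_order(durations))
```

On this toy workload, running the shortest jobs first gives a lower average slowdown than running them in arrival order, which is why shortest job first is a natural heuristic baseline for a learned scheduler.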
