Article

Preemptive and Low Latency Datacenter Scheduling via Lightweight Containers

Journal

IEEE Transactions on Parallel and Distributed Systems

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TPDS.2019.2957754

Keywords

Task analysis; Sparks; Yarn; Containers; Delays; Resource management; Processor scheduling; Datacenter scheduling; OS lightweight virtualization; preemption; multi-tenancy; heterogeneous workloads

Funding

  1. U.S. NSF [SHF-1816850, CNS-1422119]

Abstract

Datacenters are evolving to host heterogeneous workloads on shared clusters to reduce operational cost and achieve higher resource utilization. However, it is challenging to schedule heterogeneous workloads with diverse resource requirements and QoS constraints. On one hand, latency-critical jobs need to be scheduled as soon as they are submitted to avoid any queuing delay. On the other hand, best-effort long jobs should be allowed to occupy the cluster when there are idle resources so as to improve cluster utilization. The challenge lies in minimizing the queuing delays of short jobs while maximizing cluster utilization. In this article, we propose and develop BIG-C, a container-based resource management framework for data-intensive cluster computing. The key design is to leverage lightweight virtualization, a.k.a. containers, to make tasks preemptable in cluster scheduling. We devise two types of preemption strategies, immediate and graceful preemption, and show their effectiveness and tradeoffs with loosely coupled MapReduce workloads as well as iterative, in-memory Spark workloads. Based on these task preemption mechanisms, we further develop job-level and task-level preemptive policies as well as a preemptive fair-share cluster scheduler. Our implementation on YARN and evaluation with synthetic and production workloads show that low job latency and high resource utilization can both be attained when scheduling heterogeneous workloads on a contended cluster.
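To make the two preemption strategies named in the abstract concrete, the sketch below shows, in Python, how a node-side agent might preempt a best-effort task that runs inside a container by driving the standard Docker CLI: immediate preemption freezes the container at once so its resources can be handed to a latency-critical task, while graceful preemption shrinks the container's CPU allocation in steps before suspending it. This is only an illustration under assumed names and parameters (the container name `long-task-42`, the CPU steps, and the helper functions are hypothetical); the paper's actual mechanism is implemented inside YARN's resource management stack, not via this CLI wrapper.

```python
import subprocess
import time

def docker(*args):
    """Run a docker CLI command and raise if it fails."""
    subprocess.run(["docker", *args], check=True)

def immediate_preempt(container):
    """Immediate preemption (sketch): freeze the task's container right away
    so its CPU and memory can be reassigned to a latency-critical task.
    The frozen container can later be resumed with `docker unpause`."""
    docker("pause", container)

def graceful_preempt(container, cpu_steps=(2.0, 1.0, 0.5), step_secs=5):
    """Graceful preemption (sketch): shrink the container's CPU allocation in
    several steps, giving the long task time to release resources or save
    progress before it is finally suspended."""
    for cpus in cpu_steps:
        docker("update", f"--cpus={cpus}", container)
        time.sleep(step_secs)
    docker("pause", container)

def resume(container, cpus=4.0):
    """Restore a preempted task once the short job has finished."""
    docker("update", f"--cpus={cpus}", container)
    docker("unpause", container)

if __name__ == "__main__":
    # "long-task-42" is a hypothetical container running a best-effort task.
    graceful_preempt("long-task-42")
    # ... the latency-critical task runs on the reclaimed resources here ...
    resume("long-task-42")
```

The tradeoff the abstract alludes to is visible even in this toy version: immediate preemption frees resources instantly but stalls the long task abruptly, whereas graceful preemption lets the task drain its work at the cost of a slightly longer wait for the short job.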
