Journal
IEEE TRANSACTIONS ON COMPUTERS
Volume 70, Issue 7, Pages 992-1005
Publisher
IEEE COMPUTER SOC
DOI: 10.1109/TC.2020.2999619
Keywords
Synchronization; Cloud computing; Degradation; Virtual machine monitors; Dynamic scheduling; Processor scheduling; Instruction sets; Parallel application; synchronization overhead; cloud platform; LHP problem; cache miss rate; lock latency
This paper proposes an Adaptive Time-slice Control (ATC) mechanism that can effectively optimize the performance of parallel applications by shortening time-slices during communication phases, prolonging time-slices during computation phases, and setting a uniform time-slice for non-parallel applications. Experimental results demonstrate that ATC achieves significant performance gains for running parallel applications, outperforming state-of-the-art solutions.
Cloud platforms can provide flexible and cost-effective environments for parallel applications. However, resource over-commitment, in which cloud providers offer many more virtual CPUs than there are physical CPUs available, still impedes the synchronization operations of parallel applications and causes severe performance degradation. Existing methods optimize parallel applications by raising the priorities of the virtual machines (VMs) involved. They cannot fully exploit the performance of parallel applications because they ignore the time-slice requirements of the different phases of those applications. Furthermore, non-parallel applications suffer unsatisfactory performance because of their low scheduling priorities. Through an empirical analysis of VM time-slices, we find that shortening time-slices can mitigate the synchronization overhead incurred during communication phases, while overly short time-slices cause frequent cache misses in computation phases. Accordingly, we propose an Adaptive Time-slice Control (ATC) mechanism. ATC first detects the phases of parallel applications based on lock latency or cache misses. Then, ATC shortens time-slices during communication phases and prolongs time-slices during computation phases for parallel applications, and sets a uniform time-slice for non-parallel applications. We evaluate ATC using seven well-known benchmarks with 25+ applications. Experiments show that ATC obtains a 1.5-75x performance gain for parallel applications over state-of-the-art solutions, with almost no impact on non-parallel applications.
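The control policy described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `VmSample`, `detect_phase`, and `choose_time_slice_ms` are hypothetical, every threshold and time-slice value is a placeholder rather than a figure from the paper, and only lock latency is used as the phase signal here (the abstract also mentions cache misses, without saying how the two signals are combined).

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    COMMUNICATION = "communication"
    COMPUTATION = "computation"


# Hypothetical placeholder values; none of these numbers come from the paper.
SHORT_SLICE_MS = 1.0               # used during communication phases
LONG_SLICE_MS = 30.0               # used during computation phases
UNIFORM_SLICE_MS = 10.0            # used for non-parallel applications
LOCK_LATENCY_THRESHOLD_US = 100.0  # above this, assume a communication phase


@dataclass
class VmSample:
    """One monitoring sample for a VM (hypothetical structure)."""
    is_parallel: bool
    avg_lock_latency_us: float


def detect_phase(sample: VmSample) -> Phase:
    """Classify the current phase from lock latency alone (an assumption)."""
    if sample.avg_lock_latency_us > LOCK_LATENCY_THRESHOLD_US:
        return Phase.COMMUNICATION
    return Phase.COMPUTATION


def choose_time_slice_ms(sample: VmSample) -> float:
    """Pick a time-slice: uniform for non-parallel VMs, short or long otherwise."""
    if not sample.is_parallel:
        return UNIFORM_SLICE_MS
    if detect_phase(sample) is Phase.COMMUNICATION:
        return SHORT_SLICE_MS
    return LONG_SLICE_MS
```

In this sketch, a parallel VM whose sampled lock latency exceeds the threshold receives the short slice, a parallel VM below the threshold receives the long slice, and a non-parallel VM always receives the uniform slice regardless of its lock latency.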