Journal
IEEE COMPUTER ARCHITECTURE LETTERS
Volume 18, Issue 2, Pages 157-160
Publisher
IEEE COMPUTER SOC
DOI: 10.1109/LCA.2019.2955119
Keywords
Graphics processing units; Task analysis; Bandwidth; Data transfer; Switches; Quality of service; Throughput; Multi-GPU; multi-tenant; PCIe scheduling
Funding
- NSF [CNS-1525412]
- NSF of China [61672526]
- Research Project of NUDT [ZK17-03-06]
- Science and Technology Innovation Project of Hunan Province [2018RS3083]
Abstract
Multi-GPU systems are widely used in data centers to provide significant speedups to compute-intensive workloads such as deep neural network training. However, limited PCIe bandwidth between the CPU and multiple GPUs becomes a major performance bottleneck. We observe that relying on a traditional Round-Robin-based PCIe scheduling policy can result in severe bandwidth contention and stall the execution of multiple GPUs. In this article, we propose a priority-based scheduling policy that overlaps the data transfers of one application with the GPU execution of others, alleviating this bandwidth contention. We also propose a dynamic priority policy for semi-QoS management that helps applications meet QoS requirements and improves overall multi-GPU system throughput. Experimental results show that our priority-based PCIe scheduling scheme improves system throughput by 7.6 percent on average compared with a Round-Robin-based PCIe scheduler. Leveraging semi-QoS management helps meet defined QoS goals while preserving application throughput.
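The core idea of the abstract can be illustrated with a minimal sketch. The paper's actual scheduler is not specified here, so the class, fields, and priority rule below are assumptions: transfer requests from applications whose GPU is currently stalled waiting on data are served before requests from applications whose GPU still has work to execute, so PCIe transfers overlap with GPU execution instead of competing in Round-Robin order.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count

# Hypothetical sketch of a priority-based PCIe transfer scheduler.
# Priority rule (assumed, not from the paper): a request from an
# application whose GPU is stalled on data gets priority 0 (urgent);
# a request from an application whose GPU is still executing gets
# priority 1, since its transfer can overlap with that execution.

@dataclass(order=True)
class TransferRequest:
    priority: int                      # 0 = GPU stalled, 1 = GPU busy
    seq: int                           # FIFO tie-breaker within a level
    app: str = field(compare=False)    # requesting application
    size_mb: int = field(compare=False)

class PriorityPCIeScheduler:
    def __init__(self):
        self._heap = []
        self._seq = count()

    def submit(self, app, size_mb, gpu_stalled):
        """Enqueue a transfer request with a priority derived from GPU state."""
        prio = 0 if gpu_stalled else 1
        heapq.heappush(self._heap,
                       TransferRequest(prio, next(self._seq), app, size_mb))

    def next_transfer(self):
        """Grant the PCIe bus to the highest-priority pending request."""
        return heapq.heappop(self._heap) if self._heap else None

# Usage: B's GPU is stalled, so B's transfer jumps ahead of A and C.
sched = PriorityPCIeScheduler()
sched.submit("A", 64, gpu_stalled=False)
sched.submit("B", 32, gpu_stalled=True)
sched.submit("C", 16, gpu_stalled=False)
order = [sched.next_transfer().app for _ in range(3)]  # ["B", "A", "C"]
```

A Round-Robin scheduler would instead serve A, B, C in submission order, leaving B's GPU idle until its data arrives; the dynamic-priority semi-QoS policy described in the abstract would additionally adjust these priorities at runtime as applications approach their QoS deadlines.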