3.8 Proceedings Paper

Pipelined Data-Parallel CPU/GPU Scheduling for Multi-DNN Real-Time Inference

Journal

2019 IEEE Real-Time Systems Symposium (RTSS)

Publisher

IEEE
DOI: 10.1109/RTSS46320.2019.00042

Abstract

Deep neural networks (DNNs) have shown significant success in various applications, such as autonomous driving, mobile devices, and the Internet of Things. Although much research has been conducted to optimize the structure of DNNs, limited attention has been given to their timely execution, specifically to the scheduling of real-time inference requests across multiple DNN models. For instance, existing DNN frameworks such as Caffe, TensorFlow, and Torch provide only a single-level-priority, one-DNN-per-process execution model and sequential inference interfaces. These limitations become particularly problematic in edge computing and in-vehicle intelligence systems that serve multiple DNNs, as worst-case response time can grow unpredictably long while system resources remain underutilized. This paper presents DART, a DNN scheduling framework that offers deterministic response time to real-time tasks and increased throughput to best-effort tasks. DART employs a pipeline-based scheduling architecture with data parallelism, in which heterogeneous CPUs and GPUs are arranged into nodes with different parallelism levels. DART also includes pipeline stage design and node configuration schemes, admission control, execution time profiling, and runtime enforcement techniques. We evaluated DART on Intel x86 Xeon and Nvidia ARM platforms equipped with GPUs. Experimental results indicate that DART significantly outperforms existing approaches, achieving up to 98.5% shorter worst-case response time for real-time tasks while simultaneously providing up to 17.9% higher throughput for best-effort tasks.
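
To make the architecture described in the abstract concrete, the following is a minimal, hypothetical Python sketch of the core idea: heterogeneous workers grouped into pipeline nodes, real-time requests prioritized ahead of best-effort ones, each request split data-parallel across a node's workers, and a profiling-based admission test. All class and function names (PipelineStage, Request, admit, the worker functions) are illustrative assumptions for this summary, not DART's actual implementation or API.

```python
# Sketch of pipelined, data-parallel CPU/GPU scheduling (names are assumed,
# not taken from the DART paper).
import queue
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(order=True)
class Request:
    priority: int                       # 0 = real-time, 1 = best-effort
    deadline_ms: float = field(compare=False)
    data: List[float] = field(compare=False)


class PipelineStage:
    """One pipeline node: a pool of homogeneous workers (e.g., CPU cores or a
    GPU stream) that processes each request data-parallel across the pool."""

    def __init__(self, name: str, workers: int,
                 fn: Callable[[List[float]], List[float]]):
        self.name, self.fn, self.workers = name, fn, workers
        self.pool = ThreadPoolExecutor(max_workers=workers)
        self.inbox: "queue.PriorityQueue[Request]" = queue.PriorityQueue()
        self.next_stage: Optional["PipelineStage"] = None

    def submit(self, req: Request) -> None:
        # Real-time requests (lower priority value) are dequeued first.
        self.inbox.put(req)

    def run_once(self) -> Request:
        req = self.inbox.get()
        # Data parallelism: split the input across the node's workers,
        # process the chunks concurrently, then reassemble the result.
        chunks = [req.data[i::self.workers] for i in range(self.workers)]
        parts = list(self.pool.map(self.fn, chunks))
        req.data = [x for part in parts for x in part]
        if self.next_stage is not None:
            self.next_stage.submit(req)
        return req


def admit(wcet_ms: List[float], deadline_ms: float) -> bool:
    """Toy admission test: admit a real-time task only if the sum of profiled
    per-stage worst-case execution times fits within its deadline. The paper's
    actual analysis also accounts for pipelining and interference."""
    return sum(wcet_ms) <= deadline_ms


# Example wiring: a 4-worker CPU preprocessing node feeding a single-worker
# node standing in for the GPU.
pre = PipelineStage("cpu-pre", workers=4, fn=lambda xs: [x * 0.5 for x in xs])
gpu = PipelineStage("gpu-infer", workers=1, fn=lambda xs: [x + 1.0 for x in xs])
pre.next_stage = gpu

if admit(wcet_ms=[2.0, 5.0], deadline_ms=33.0):
    pre.submit(Request(priority=0, deadline_ms=33.0, data=[1.0, 2.0, 3.0, 4.0]))
    pre.run_once()
    result = gpu.run_once()
    print(result.data)  # -> [1.5, 2.0, 2.5, 3.0]
```

In DART itself, node configuration, per-stage timing profiles, and runtime enforcement are considerably more involved; this sketch only mirrors the pipeline-plus-data-parallelism structure and the priority split between real-time and best-effort tasks.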
