Journal
MOBILE NETWORKS & APPLICATIONS
Volume 26, Issue 3, Pages 1243-1249
Publisher
SPRINGER
DOI: 10.1007/s11036-019-01345-0
Keywords
Deep learning; Mobile computing; Inference; Distributed algorithm
This paper proposes collaborative inference among mobile devices, which share computation workloads and speed up processing by batching inference tasks on GPUs. A particle swarm optimization (PSO) based algorithm is designed for efficient collaboration, along with a distributed algorithm that addresses the difficulty of collecting global network information and running a centralized algorithm. Extensive simulations show that the collaborative inference scheme effectively reduces the inference time of mobile deep learning applications.
Deep learning stimulates many novel mobile applications, but enabling efficient mobile deep learning applications remains challenging. The traditional approach tackles this challenge by offloading computation tasks to the cloud, which suffers from high bandwidth requirements and long transmission latency. In this paper, we propose to enable collaborative inference among mobile devices. Instead of sending deep learning inference tasks to the cloud, we let mobile devices collaboratively share the computation workloads. This is based on an important observation that batching inference tasks on GPUs can accelerate the inference processing speed. To achieve efficient collaboration, we design an algorithm based on particle swarm optimization (PSO), a versatile population-based stochastic optimization technique. We also design a distributed algorithm to address the difficulty of collecting global network information and running the centralized algorithm. Moreover, extensive simulations are conducted to evaluate the performance of the designed algorithm. The simulation results show that the collaborative inference scheme can effectively reduce the inference time of mobile deep learning applications.
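The abstract's key observation, that batching inference requests on a GPU raises throughput, is easy to reproduce. The following is a minimal sketch, not the paper's code, assuming PyTorch and torchvision are installed; the model choice (ResNet-18) and input shape are illustrative only.

```python
# Minimal sketch: measure per-image inference time at several batch sizes.
# Assumes PyTorch/torchvision; ResNet-18 stands in for any CNN workload.
import time
import torch
import torchvision.models as models

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = models.resnet18(weights=None).eval().to(device)

def time_inference(batch_size, repeats=20):
    """Average wall-clock time per image at a given batch size."""
    x = torch.randn(batch_size, 3, 224, 224, device=device)
    with torch.no_grad():
        model(x)                          # warm-up run
        if device.type == "cuda":
            torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(repeats):
            model(x)
        if device.type == "cuda":
            torch.cuda.synchronize()
        elapsed = time.perf_counter() - start
    return elapsed / (repeats * batch_size)

for bs in (1, 4, 16, 64):
    print(f"batch={bs:3d}: {time_inference(bs) * 1e3:.2f} ms/image")
```

On a GPU, the ms/image figure typically drops as the batch grows, which is the effect the collaborative scheme exploits. For the PSO-based collaboration algorithm, the abstract names only the optimization technique, not its formulation. Below is an illustrative sketch under assumed details: particles encode a soft assignment of tasks to devices, and fitness is the makespan under a toy cost model in which per-device time grows sublinearly with batch size, mimicking the batching speedup. None of the constants or the cost model come from the paper.

```python
# Illustrative PSO sketch (assumptions, not the paper's algorithm): assign
# N inference tasks to M devices so as to minimize a toy makespan.
import numpy as np

rng = np.random.default_rng(0)
N_TASKS, M_DEVICES = 30, 5
SPEED = rng.uniform(0.5, 2.0, M_DEVICES)   # hypothetical device throughputs

def makespan(position):
    # Decode: task i goes to the device with the highest score in its row.
    assign = position.reshape(N_TASKS, M_DEVICES).argmax(axis=1)
    loads = np.bincount(assign, minlength=M_DEVICES)
    # Toy batching model: per-device time grows sublinearly with batch size.
    return (loads ** 0.8 / SPEED).max()

DIM = N_TASKS * M_DEVICES
SWARM, ITERS, W, C1, C2 = 40, 200, 0.7, 1.5, 1.5
pos = rng.uniform(-1, 1, (SWARM, DIM))
vel = np.zeros((SWARM, DIM))
pbest, pbest_val = pos.copy(), np.array([makespan(p) for p in pos])
gbest = pbest[pbest_val.argmin()].copy()

for _ in range(ITERS):
    r1, r2 = rng.random((SWARM, DIM)), rng.random((SWARM, DIM))
    vel = W * vel + C1 * r1 * (pbest - pos) + C2 * r2 * (gbest - pos)
    pos += vel
    vals = np.array([makespan(p) for p in pos])
    improved = vals < pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[pbest_val.argmin()].copy()

print(f"best makespan found: {pbest_val.min():.3f}")
```

This loop is centralized: it needs the global task and device information up front, which is exactly the difficulty the paper's distributed algorithm is designed to avoid.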