Proceedings Paper

Auto-tuning Neural Network Quantization Framework for Collaborative Inference Between the Cloud and Edge

Publisher

SPRINGER INTERNATIONAL PUBLISHING AG
DOI: 10.1007/978-3-030-01418-6_40

Keywords

Neural network quantization; Auto-tuning framework; Edge computing; Collaborative inference

Funding

  1. National Key R&D Program of China [2017YFB0202002]
  2. Science Fund for Creative Research Groups of the National Natural Science Foundation of China [61521092]
  3. Key Program of National Natural Science Foundation of China [61432018, 61332009, U1736208]

Abstract

Recently, deep neural networks (DNNs) have been widely applied in mobile intelligent applications. Inference for these DNNs is usually performed in the cloud, but this incurs a large overhead for transmitting data over the wireless network. In this paper, we demonstrate the advantages of cloud-edge collaborative inference with quantization. By analyzing the characteristics of the layers in DNNs, we propose an auto-tuning neural network quantization framework for collaborative inference. We study the effectiveness of mixed-precision collaborative inference of state-of-the-art DNNs using the ImageNet dataset. The experimental results show that our framework generates reasonable network partitions and reduces storage on mobile devices with a trivial loss of accuracy.
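To make the abstract's two ingredients concrete, below is a minimal Python sketch, not the authors' actual framework: per-layer uniform weight quantization, which shrinks on-device storage, and a partition-point search that cuts the network where the intermediate activations crossing the wireless link are smallest. The layer names, activation sizes, and the best_partition helper are hypothetical placeholders for illustration only.

    import numpy as np

    def quantize_uniform(w, bits):
        """Symmetric uniform ("fake") quantization of a weight tensor to `bits` bits."""
        qmax = 2 ** (bits - 1) - 1
        scale = max(float(np.abs(w).max()), 1e-8) / qmax
        return np.clip(np.round(w / scale), -qmax - 1, qmax) * scale

    # Hypothetical per-layer output sizes in bytes for a small CNN; partitioning
    # after layer i means layer i's activations are sent over the wireless link.
    act_bytes = {"conv1": 802816, "conv2": 401408, "fc1": 4096, "fc2": 4000}

    def best_partition(act_bytes):
        # Choose the cut with the smallest intermediate feature map to minimize
        # the data transmitted from the edge device to the cloud.
        return min(act_bytes, key=act_bytes.get)

    if __name__ == "__main__":
        w = np.random.randn(64, 64).astype(np.float32)
        err = np.abs(w - quantize_uniform(w, 8)).max()
        print(f"max 8-bit quantization error: {err:.4f}")
        print(f"partition after: {best_partition(act_bytes)}")

In an auto-tuned mixed-precision setting, the bit width would vary per layer according to each layer's sensitivity, and the partition choice would weigh transmission cost against the accuracy loss of the quantized edge-side layers.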
