Article

On-Device Deep Multi-Task Inference via Multi-Task Zipping

Journal

IEEE Transactions on Mobile Computing
Volume 22, Issue 5, Pages 2878-2891

Publisher

IEEE Computer Society
DOI: 10.1109/TMC.2021.3124306

Keywords

Task analysis; Deep learning; Neurons; Training; Biological neural networks; Redundancy; Mobile computing; Deep neural networks; model compression; multi-task learning

Abstract

Future mobile devices are anticipated to perceive, understand, and react to the world on their own by running multiple correlated deep neural networks locally on-device. Yet the complexity of these deep models needs to be trimmed down both within each model and across models to fit in mobile storage and memory. Previous studies squeeze the redundancy within a single model. In this work, we aim to reduce the redundancy across multiple models. We propose Multi-Task Zipping (MTZ), a framework that automatically merges correlated, pre-trained deep neural networks for cross-model compression. Central to MTZ is a layer-wise neuron sharing and incoming weight updating scheme that induces a minimal change in the error function. MTZ inherits information from each model and demands only light retraining to re-boost the accuracy of the individual tasks. MTZ supports typical network layers (fully-connected, convolutional, and residual) and applies to inference tasks with different input domains. Evaluations show that MTZ can fully merge the hidden layers of two VGG-16 networks with a 3.18% increase in the test error averaged over ImageNet for object classification and CelebA for facial attribute classification, or share 39.61% of the parameters between the two networks with a < 0.5% increase in the test errors. The number of iterations to retrain the combined network is at least 17.8x lower than that needed to train a single VGG-16 network. Moreover, MTZ can effectively merge nine residual networks for diverse inference tasks, as well as models for different input domains. With the model merged by MTZ, the latency to switch between these tasks on memory-constrained devices is reduced by 8.71x.
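As a rough illustration of the layer-wise neuron sharing idea described in the abstract, the Python/NumPy sketch below merges part of one fully-connected layer taken from two pre-trained networks. It is a simplified stand-in rather than the authors' method: MTZ chooses which neurons to share and how to update their incoming weights from a Hessian-based estimate of the change in the error function, followed by light retraining, whereas this sketch pairs neurons greedily by the distance between their incoming weight vectors and simply averages them. The function share_neurons and all array shapes are hypothetical.

# Toy sketch of layer-wise neuron sharing for cross-model compression.
# NOTE: this is NOT the MTZ algorithm itself. MTZ selects neuron pairs and
# their updated incoming weights via a Hessian-based estimate of the change
# in each task's error function, followed by light retraining. Here, purely
# for illustration, neurons are paired greedily by the Euclidean distance
# between their incoming weight vectors and the paired weights are averaged.
import numpy as np


def share_neurons(W_a, W_b, num_shared):
    """Merge `num_shared` neuron pairs of one fully-connected layer taken
    from two pre-trained networks.

    W_a, W_b : (in_dim, out_dim) incoming weight matrices of the same layer
               index in network A and network B.
    Returns (shared, rest_a, rest_b, pairs): the shared incoming weights,
    the task-specific columns left in each network, and the chosen pairs.
    """
    assert W_a.shape[0] == W_b.shape[0], "layers must have the same input dimension"

    # dist[i, j] = distance between neuron i of A and neuron j of B.
    dist = np.linalg.norm(W_a[:, :, None] - W_b[:, None, :], axis=0)

    pairs, used_a, used_b = [], set(), set()
    for flat in np.argsort(dist, axis=None):          # most similar pairs first
        if len(pairs) == num_shared:
            break
        i, j = np.unravel_index(flat, dist.shape)
        if i in used_a or j in used_b:
            continue
        pairs.append((i, j))
        used_a.add(i)
        used_b.add(j)

    # Shared neurons: average the paired incoming weights (a stand-in for
    # MTZ's error-aware weight update).
    shared = np.stack([(W_a[:, i] + W_b[:, j]) / 2.0 for i, j in pairs], axis=1)
    rest_a = W_a[:, [i for i in range(W_a.shape[1]) if i not in used_a]]
    rest_b = W_b[:, [j for j in range(W_b.shape[1]) if j not in used_b]]
    return shared, rest_a, rest_b, pairs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W_a = rng.normal(size=(128, 64))   # layer from network A
    W_b = rng.normal(size=(128, 64))   # same layer from network B
    shared, rest_a, rest_b, pairs = share_neurons(W_a, W_b, num_shared=25)
    print(shared.shape, rest_a.shape, rest_b.shape)   # (128, 25) (128, 39) (128, 39)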

Reviews

Primary Rating

4.7 (not enough ratings)
