Literature Review of Deep Network Compression

Journal

INFORMATICS-BASEL
Volume 8, Issue 4

Publisher

MDPI
DOI: 10.3390/informatics8040077

Keywords

deep learning; neural network pruning; model compression

Funding

  1. Deanship of Scientific Research, King Khalid University, Kingdom of Saudi Arabia [RGP1/207/42]


Abstract

Deep networks often possess a vast number of parameters, and significant redundancy in their parameterization is a widely recognized property. This redundancy presents challenges that restrict many deep learning applications, motivating efforts to reduce model complexity while maintaining powerful performance. This paper presents an overview of popular methods and reviews recent work on compressing and accelerating deep neural networks, covering not only pruning methods but also quantization and low-rank factorization methods. The review also clarifies these major concepts and highlights their characteristics, advantages, and shortcomings.
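The abstract names three compression families: pruning, quantization, and low-rank factorization. As a minimal, illustrative sketch (not the paper's own methods), each can be demonstrated on a raw weight array with NumPy; all function names here are hypothetical:

```python
import numpy as np


def magnitude_prune(w: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the given fraction of smallest-magnitude weights."""
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) > threshold, w, 0.0)


def uniform_quantize(w: np.ndarray, bits: int = 8) -> np.ndarray:
    """Simulate symmetric uniform quantization to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1          # e.g. 127 for 8-bit signed
    scale = np.max(np.abs(w)) / qmax    # assumes w is not all zeros
    return np.round(w / scale) * scale  # quantize, then dequantize


def low_rank_factorize(w: np.ndarray, rank: int):
    """Approximate a weight matrix as the product of two thin factors."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    return u[:, :rank] * s[:rank], vt[:rank]   # W is approximately A @ B
```

Each sketch trades accuracy for size: pruning stores a sparse mask, quantization stores low-bit integers plus one scale, and factorization stores two matrices with `rank * (m + n)` entries instead of `m * n`.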


Reviews

Primary Rating

3.8
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-

Recommended

No Data Available