Proceedings Paper

Partition Pruning: Parallelization-Aware Pruning for Dense Neural Networks

Publisher

IEEE Computer Society
DOI: 10.1109/PDP50117.2020.00053

Keywords

Parallelization; Deep Neural Network; Pruning; Partitioning; Hardware Accelerator

Abstract

As recent neural networks are improved to be more accurate, their model sizes grow exponentially. As a result, a huge number of parameters must be loaded from and stored to the memory hierarchy and computed in processors to perform the training or inference phase of neural network processing. This growth in parameter count poses a major challenge for real-time deployment, since the trend in memory bandwidth improvement cannot keep up with the trend in model complexity. Although some operations in neural network processing, such as convolutional layer computation, are compute-intensive, computing dense layers is bottlenecked by memory bandwidth. To address this issue, this paper proposes Partition Pruning for dense layers, which reduces the number of required parameters while taking parallelization into account. We evaluated the performance and energy consumption of parallel inference on partitioned models, observing a 7.72x speedup and a 2.73x reduction in energy when computing the pruned fully connected layers of the TinyVGG16 model, compared to running the unpruned model on a single accelerator. In addition, our method showed only a limited reduction in accuracy when partitioning fully connected layers.
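To make the idea concrete, the following is a minimal NumPy sketch of partition-aware magnitude pruning for a dense layer: the weight matrix is split column-wise into one partition per accelerator, and each partition is pruned independently to the same keep ratio so the parallel workload stays balanced. The function name partition_prune, the column-wise split, and the magnitude criterion are illustrative assumptions; the paper's exact algorithm may differ.

    import numpy as np

    def partition_prune(W, n_partitions, keep_ratio):
        # Illustrative sketch (not the paper's exact algorithm):
        # split the dense layer's weights column-wise, one partition
        # per accelerator, then magnitude-prune each partition to the
        # same keep ratio so every accelerator gets a balanced load.
        partitions = np.array_split(W, n_partitions, axis=1)
        pruned = []
        for P in partitions:
            k = max(1, int(P.size * keep_ratio))   # weights kept per partition
            threshold = np.sort(np.abs(P), axis=None)[-k]
            mask = np.abs(P) >= threshold          # keep the k largest magnitudes
            pruned.append(P * mask)
        return np.concatenate(pruned, axis=1)

    # Toy usage: a 512x1024 dense layer split across 4 accelerators,
    # keeping 25% of the weights in each partition.
    W = np.random.randn(512, 1024).astype(np.float32)
    W_pruned = partition_prune(W, n_partitions=4, keep_ratio=0.25)
    print(np.count_nonzero(W_pruned) / W_pruned.size)  # roughly 0.25

Pruning each partition to an identical keep ratio, rather than pruning the matrix globally, keeps the per-accelerator work uniform, which is the parallelization-aware aspect the abstract describes.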
