Article

DNN Model Compression for IoT Domain-Specific Hardware Accelerators

Journal

IEEE INTERNET OF THINGS JOURNAL
Volume 9, Issue 9, Pages 6650-6662

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/JIOT.2021.3111723

Keywords

Deep neural network (DNN) accelerator; DNN model compression; domain-specific accelerator; energy versus performance versus accuracy tradeoff; neural networks

Funding

  1. Italian Ministry of University and Research (MIUR) [ARS01_00345, ARS01_00353]
  2. University of Catania-Piaceri

Abstract

This article proposes a compression technique called LineCompress, which reduces memory and communication overhead on resource-constrained IoT devices, thereby improving the inference speed and energy efficiency of deep neural networks.
Machine learning techniques, particularly those based on neural networks, are increasingly deployed at the edge of the network by Internet of Things (IoT) nodes. Unfortunately, the computational capabilities demanded by those applications, together with their energy-efficiency constraints, exceed what embedded general-purpose processors can offer. For this reason, the use of domain-specific hardware accelerators (DSAs) is considered the most viable solution to the unsustainable Turing tariff of general-purpose hardware. Starting from the observation that memory and communication traffic account for a large fraction of the overall latency and energy of deep neural network (DNN) inference, this article proposes a new compression technique aimed at: 1) reducing the memory footprint for storing the model parameters of a DNN and 2) improving DNN inference latency and energy on resource-constrained IoT devices. The proposed compression technique, namely, LineCompress, is applied to a set of representative convolutional neural networks (CNNs) for object recognition mapped on a state-of-the-art DSA targeted at resource-constrained IoT devices. We show that, on average, a 7.4x memory footprint reduction can be obtained; the resulting decrease in memory and communication traffic yields 77% and 87% reductions in inference latency and energy, respectively, trading off efficiency against accuracy.
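The abstract does not describe how LineCompress actually encodes the model parameters, so the sketch below is not the paper's method. It is only a minimal, hedged illustration of the general class of technique the article targets: lossy per-tensor 8-bit weight quantization, which shrinks a layer's parameter footprint (4x for float32 to int8) at some cost in accuracy. All function names here are hypothetical.

```python
# Illustrative sketch only: NOT the paper's LineCompress algorithm.
# Shows generic 8-bit linear weight quantization, one common way to
# reduce a DNN layer's parameter footprint on constrained devices.
import numpy as np

def quantize_weights(w: np.ndarray, num_bits: int = 8):
    """Map float32 weights to signed integers plus a per-tensor scale."""
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 127 for 8 bits
    scale = np.abs(w).max() / qmax          # per-tensor scale factor
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize_weights(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float32 weights."""
    return q.astype(np.float32) * scale

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(0.0, 0.05, size=(256, 256)).astype(np.float32)

    q, scale = quantize_weights(w)
    w_hat = dequantize_weights(q, scale)

    ratio = w.nbytes / q.nbytes             # 4x for float32 -> int8
    err = np.abs(w - w_hat).max()           # efficiency-vs-accuracy cost
    print(f"footprint reduction: {ratio:.1f}x, max abs error: {err:.5f}")
```

The per-tensor scale keeps decoding trivial on an accelerator, but the quantization error it introduces is exactly the kind of efficiency-versus-accuracy tradeoff the abstract refers to.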
