Journal
IEEE TRANSACTIONS ON COMPUTERS
Volume 70, Issue 8, Pages 1239-1252
Publisher
IEEE COMPUTER SOC
DOI: 10.1109/TC.2021.3062227
Keywords
Internet of Things; Task analysis; Memory management; Hardware; Complexity theory; Pipelines; Optimization; Deep learning; convolutional neural networks; Internet-of-Things
Severe constraints on memory and computation characterizing Internet-of-Things (IoT) units may prevent the execution of Deep Learning (DL)-based solutions, which typically demand large memory and high processing load. To support real-time execution of a DL model at the IoT unit level, DL solutions must be designed with the constraints on memory and processing capability of the chosen IoT technology in mind. In this article, we introduce a design methodology for allocating the execution of Convolutional Neural Networks (CNNs) on a distributed IoT application. The methodology is formalized as an optimization problem that minimizes the latency between the data-gathering phase and the subsequent decision-making phase, within the given constraints on memory and processing load at the unit level. The methodology supports multiple sources of data as well as multiple CNNs executing on the same IoT system, enabling the design of CNN-based applications demanding autonomy, low decision latency, and high Quality-of-Service.
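The abstract describes an allocation problem: place CNN computation on IoT units so that end-to-end latency is minimized while each unit's memory and processing limits are respected. The paper's actual formulation is not given on this page; the following is a minimal illustrative sketch under assumed simplifications (a single CNN split as a contiguous pipeline across a chain of units, with hypothetical per-layer memory/compute figures and unit capacities):

```python
from itertools import combinations

# Hypothetical per-layer footprints (not from the paper):
# memory (KB) and compute load (arbitrary op units) of a small CNN.
layers_mem = [60, 40, 30, 20]
layers_ops = [8.0, 6.0, 3.0, 1.0]

# Hypothetical IoT units: (memory capacity in KB, speed in ops per ms).
units = [(100, 2.0), (80, 4.0)]

def latency(split_points):
    """Latency of one pipeline split: each unit runs a contiguous
    block of layers; returns None if a block exceeds unit memory."""
    bounds = [0, *split_points, len(layers_mem)]
    total = 0.0
    for u, (cap, speed) in enumerate(units):
        block = range(bounds[u], bounds[u + 1])
        if sum(layers_mem[i] for i in block) > cap:
            return None  # memory constraint violated on unit u
        total += sum(layers_ops[i] for i in block) / speed
    return total

# Brute-force every way to cut the layer list into len(units) blocks
# and keep the feasible split with minimum latency.
best = min(
    (lat, s)
    for s in combinations(range(1, len(layers_mem)), len(units) - 1)
    if (lat := latency(s)) is not None
)
print(best)  # → (8.0, (2,)): layers 0-1 on unit 0, layers 2-3 on unit 1
```

A real instance would also model inter-unit transfer costs and multiple concurrent CNNs, and would use an ILP or heuristic solver rather than enumeration; the sketch only shows the shape of the constrained latency-minimization objective.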