Journal
ELECTRONICS
Volume 10, Issue 3, Pages -
Publisher
MDPI
DOI: 10.3390/electronics10030230
Keywords
artificial intelligence (AI); binary neural network (BNN); FPGA; machine learning; pattern recognition; VLSI
Funding
- Institute of Information & Communications Technology Planning & Evaluation (IITP) - Korean government (MSIT) [2019-0-00056, 2020-0-00201]
- IDEC
A BNN accelerator with adaptive parallelism is proposed, offering high throughput and improved area-speed efficiency. By analyzing the parameters of the target layer and operating with the optimal parallelism for the available resources, the accelerator achieves high efficiency in every layer.
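The per-layer parallelism selection described above can be sketched in a few lines. This is a minimal illustration, not the paper's actual scheduler: the function `pick_parallelism` and its divisibility heuristic (choose the largest processing-element count that evenly divides a layer's output-channel count, so no hardware lanes sit idle) are assumptions made for clarity.

```python
def pick_parallelism(out_channels, candidates=(8, 16, 32, 64)):
    # Hypothetical heuristic: pick the largest parallelism factor that
    # divides the layer's output-channel count, so every processing
    # element stays busy on every pass over the layer.
    best = 1
    for p in candidates:
        if out_channels % p == 0:
            best = max(best, p)
    return best

# Layers with different channel counts end up with different parallelism,
# which is the scenario a fixed-parallelism layer accelerator handles poorly.
for name, channels in [("conv1", 128), ("conv2", 96), ("fc", 10)]:
    print(name, pick_parallelism(channels))
```

A fixed-parallelism design would run all three layers at one factor, wasting lanes on the layers it does not divide evenly; adapting the factor per layer avoids that idle time.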
Binary neural networks (BNNs) have attracted significant interest for the implementation of deep neural networks (DNNs) on resource-constrained edge devices, and various BNN accelerator architectures have been proposed to achieve higher efficiency. BNN accelerators can be divided into two categories: streaming and layer accelerators. Although streaming accelerators designed for a specific BNN network topology provide high throughput, they are infeasible for various sensor applications in edge AI because of their complexity and inflexibility. In contrast, layer accelerators with reasonable resources can support various network topologies, but they operate with the same parallelism for all the layers of the BNN, which degrades throughput performance at certain layers. To overcome this problem, we propose a BNN accelerator with adaptive parallelism that offers high throughput performance in all layers. The proposed accelerator analyzes target layer parameters and operates with optimal parallelism using reasonable resources. In addition, this architecture is able to fully compute all types of BNN layers thanks to its reconfigurability, and it can achieve a higher area-speed efficiency than existing accelerators. In performance evaluation using state-of-the-art BNN topologies, the designed BNN accelerator achieved an area-speed efficiency 9.69 times higher than previous FPGA implementations and 24% higher than existing VLSI implementations for BNNs.
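The layer computation that such accelerators parallelize reduces, in BNNs, to an XNOR-popcount in place of multiply-accumulate: with activations and weights constrained to {-1, +1} and stored as single bits, a dot product of length n equals 2*popcount(XNOR(a, w)) - n. The sketch below illustrates this standard identity in NumPy; it is a reference model of the arithmetic only, not the proposed hardware.

```python
import numpy as np

def binarize(x):
    # Map real values to {-1, +1} by sign, encoded as {0, 1} bits.
    return (x >= 0).astype(np.uint8)

def xnor_popcount_dot(a_bits, w_bits):
    # a_bits, w_bits: equal-length 1-D arrays of {0, 1} bits.
    # In the {-1, +1} domain, dot(a, w) = 2 * popcount(XNOR) - n,
    # since each matching bit contributes +1 and each mismatch -1.
    n = a_bits.size
    matches = int(np.sum(a_bits == w_bits))  # popcount of XNOR
    return 2 * matches - n

# a -> [+1, -1, +1, -1], w -> [+1, +1, -1, -1]; dot = 1 - 1 - 1 + 1 = 0
a = binarize(np.array([0.5, -1.2, 0.3, -0.7]))
w = binarize(np.array([0.9, 0.1, -0.4, -2.0]))
print(xnor_popcount_dot(a, w))  # 0
```

Because the whole inner product becomes bitwise logic plus a counter, the silicon cost per operation is tiny, which is what makes the parallelism budget, rather than the arithmetic itself, the main design variable.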