Article

ALPINE: Analog In-Memory Acceleration With Tight Processor Integration for Deep Learning

Journal

IEEE TRANSACTIONS ON COMPUTERS
Volume 72, Issue 7, Pages 1985-1998

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TC.2022.3230285

Keywords

Hardware; Computational modeling; Computer architecture; Biological system modeling; In-memory computing; Reduced instruction set computing; Recurrent neural networks; AI accelerators; architectural exploration; artificial neural networks; gem5; neuromorphic computing

Abstract

Analog in-memory computing (AIMC) cores offer significant performance and energy benefits for neural network inference with respect to digital logic (e.g., CPUs). AIMCs accelerate matrix-vector multiplications, which dominate these applications' run-time. However, AIMC-centric platforms lack the flexibility of general-purpose systems, as they often have hard-coded data flows and can only support a limited set of processing functions. With the goal of bridging this gap in flexibility, we present a novel system architecture that tightly integrates analog in-memory computing accelerators into multi-core CPUs in general-purpose systems. We developed ALPINE, a powerful full-system simulation framework built on the gem5-X simulator, which enables an in-depth characterization of the proposed architecture. ALPINE allows the simulation of the entire computer architecture stack, from major hardware components to their interactions with the Linux OS. Within ALPINE, we have defined a custom ISA extension and a software library to facilitate the deployment of inference models. We showcase and analyze a variety of mappings of different neural network types, and demonstrate up to 20.5x/20.8x performance/energy gains with respect to a SIMD-enabled ARM CPU implementation for convolutional neural networks, multi-layer perceptrons, and recurrent neural networks.
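The kernel the abstract refers to can be made concrete with a minimal sketch: the matrix-vector multiplication (MVM) that dominates neural network inference run-time, and which an AIMC core computes in the analog domain by storing the weight matrix in a crossbar and applying the input vector as voltages. The shapes and values below are purely illustrative, not taken from the paper.

```python
import numpy as np

# Hypothetical layer weights: on an AIMC core these would be programmed
# into the analog crossbar once, then reused across inferences.
W = np.array([[0.2, -0.5, 0.1],
              [0.7,  0.3, -0.2]])

# Hypothetical input activations, applied as voltages on the crossbar rows.
x = np.array([1.0, 2.0, 3.0])

# One matrix-vector multiplication: the operation AIMC accelerates,
# computed here digitally for illustration.
y = W @ x
print(y)  # -> [-0.5  0.7]
```

On a digital CPU this costs one multiply-accumulate per weight, whereas the crossbar performs the whole MVM in a single analog step, which is the source of the performance and energy gains reported above.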
