Article

High-Performance STT-MRAM-Based Computing-in-Memory Scheme Utilizing Data Read Feature

Journal

IEEE TRANSACTIONS ON NANOTECHNOLOGY
Volume 22, Pages 817-826

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TNANO.2023.3336910

Keywords

Computing-in-memory; STT-MRAM; boolean logics; BNN; XNOR-PC

With the rise of AI and binary neural networks (BNNs), traditional computing systems face memory and power bottlenecks. To improve computing efficiency, this work proposes a computing-in-memory (CiM) scheme that uses STT-MRAM as the carrier. By exploiting the read characteristics of STT-MRAM and modifying the peripheral circuitry, it achieves higher performance and lower energy consumption.
With the development of artificial intelligence (AI) and binary neural networks (BNNs), computing systems are expected to become far more efficient. However, the enormous volume of data to be processed has created intolerable 'memory wall' and 'power wall' challenges for traditional von Neumann architectures. More advanced computing-in-memory (CiM) architectures have therefore been proposed. The emerging non-volatile memory Spin Transfer Torque Magnetic Random Access Memory (STT-MRAM), with its fast access speed, near-zero leakage power consumption, and high density, is one of the most competitive carriers for CiM architectures.
Owing to the inherent read-write asymmetry of STT-MRAM, the read overhead is many times smaller than the write overhead. To better exploit the potential of MRAM, we focus on implementing a CiM scheme based on the read process.
This work introduces the principle of CiM and proposes four basic logic operations (XNOR, XOR, AND, and OR) based on STT-MRAM. Furthermore, a delay-based XNOR-popcount (XNOR-PC) operation for BNNs, built on the XNOR operation, is proposed. By exploiting the read characteristics of STT-MRAM and making slight modifications to the peripheral circuitry, these operations achieve significant improvements in performance and energy consumption. Simulation results show that the proposed scheme reduces the latency of the XOR, AND, and OR operations by at least 99.3%, 82.2%, and 80.2%, respectively, compared with the existing design. The proposed XNOR-PC operation achieves at least a 59.4% reduction in power consumption with a high degree of reliability. Monte Carlo simulations further confirm the feasibility and robustness of the schemes.
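For readers unfamiliar with the XNOR-PC operation named in the abstract, the arithmetic it computes can be sketched in software. The sketch below is only a functional illustration of the standard XNOR-popcount identity used in BNNs (dot product of two ±1 vectors = 2 × popcount(XNOR) − n); it does not model the paper's delay-based STT-MRAM circuit, and the function names `pack_bits`, `xnor_popcount`, and `binary_dot` are hypothetical, chosen here for illustration.

```python
def pack_bits(values):
    """Pack a list of +1/-1 values into an integer, encoding +1 as bit 1 and -1 as bit 0."""
    bits = 0
    for i, v in enumerate(values):
        if v == 1:
            bits |= 1 << i
    return bits


def xnor_popcount(a_bits, w_bits, n):
    """Count the positions (out of n) where the two bit vectors agree."""
    mask = (1 << n) - 1                 # keep only the n valid bit positions
    xnor = ~(a_bits ^ w_bits) & mask    # bit is 1 where a and w are equal
    return bin(xnor).count("1")


def binary_dot(a_bits, w_bits, n):
    """Dot product of two +1/-1 vectors, recovered from the XNOR-popcount."""
    return 2 * xnor_popcount(a_bits, w_bits, n) - n


# Illustrative activation and weight vectors (assumed values, not from the paper).
activations = [1, -1, 1, 1]
weights = [1, 1, -1, 1]
a = pack_bits(activations)
w = pack_bits(weights)
reference = sum(x * y for x, y in zip(activations, weights))
print(binary_dot(a, w, len(weights)), reference)
```

The two printed values agree because each matching position contributes +1 and each mismatching position contributes -1 to the ±1 dot product, which is why a BNN layer can replace multiply-accumulate with XNOR and a population count.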
