Journal
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING
Volume 11, Issue 2, Pages 358-372
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TETC.2023.3237914
Keywords
Edge artificial intelligence; in-memory computing; hardware/software co-design; convolutional neural networks; low-power software optimization
Abstract
By supporting the access of multiple memory words at the same time, Bit-line Computing (BC) architectures allow the parallel execution of bit-wise operations in-memory. Arithmetic operations are then derived at the array periphery with little additional overhead. Such a paradigm opens novel opportunities for Artificial Intelligence (AI) at the edge, thanks to the massive parallelism inherent in memory arrays and the extreme energy efficiency of computing in situ, which avoids data transfers. Previous works have shown that BC brings disruptive efficiency gains when targeting AI workloads, a key metric in the context of emerging edge AI scenarios. This manuscript builds on these findings by proposing an end-to-end framework that leverages BC-specific optimizations to enable high parallelism and aggressive compression of AI models. Our approach is supported by a novel hardware module performing real-time decoding, as well as new algorithms enabling BC-friendly model compression. Our hardware/software approach yields 91% energy savings (under a 1% accuracy degradation constraint) with respect to state-of-the-art BC computing approaches.
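To make the computing model concrete, the sketch below is a minimal behavioral model in Python; it is not taken from the paper. It assumes a simplified SRAM-like array in which activating two word lines simultaneously exposes the bitwise AND and OR of the stored words on the bit lines, and shows how an addition can then be composed at the array periphery. The word width, class name, and method names are illustrative assumptions.

```python
# Minimal behavioral sketch of bit-line computing (illustrative, not the paper's design).
# Activating two word lines at once lets the sense amplifiers observe the bitwise AND
# (both cells high) and OR (either cell high) of the stored words; simple periphery
# logic then composes arithmetic from those bitwise results.

WORD_BITS = 8  # illustrative word width


class BitlineArray:
    def __init__(self, words):
        self.rows = list(words)  # each row holds one WORD_BITS-wide word

    def dual_row_access(self, i, j):
        """Model a simultaneous read of rows i and j: the bit lines behave like a
        wired AND/OR of the two cells, resolved by the sense amplifiers."""
        a, b = self.rows[i], self.rows[j]
        return a & b, a | b  # (bitwise AND, bitwise OR)

    def add_in_periphery(self, i, j):
        """Derive an addition outside the array from the bitwise reads:
        (a & b) ^ (a | b) == a ^ b gives the partial sum, the AND shifted left
        gives the carries, and the loop propagates carries until none remain."""
        and_bits, or_bits = self.dual_row_access(i, j)
        partial_sum, carry = and_bits ^ or_bits, and_bits << 1
        while carry:
            partial_sum, carry = partial_sum ^ carry, (partial_sum & carry) << 1
        return partial_sum & ((1 << (WORD_BITS + 1)) - 1)


# Example: two 8-bit operands stored in rows 0 and 1.
array = BitlineArray([0b1011_0101, 0b0110_0011])
print(array.add_in_periphery(0, 1))  # 181 + 99 = 280
```

The point of the sketch is only the division of labor the abstract describes: the array itself produces bit-wise results in parallel, while the (cheap) arithmetic composition happens in the periphery logic.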