4.6 Article

A 7-nm Four-Core Mixed-Precision AI Chip With 26.2-TFLOPS Hybrid-FP8 Training, 104.9-TOPS INT4 Inference, and Workload-Aware Throttling

Related References

Note: Only some of the references are listed here; download the original article for the full reference list.

Proceedings Paper Computer Science, Hardware & Architecture

A 3.0 TFLOPS 0.62V Scalable Processor Core for High Compute Utilization AI Training and Inference

Jinwook Oh et al.

2020 IEEE SYMPOSIUM ON VLSI CIRCUITS (2020)

Article Engineering, Electrical & Electronic

A 16-nm Always-On DNN Processor With Adaptive Clocking and Multi-Cycle Banked SRAMs

Sae Kyu Lee et al.

IEEE JOURNAL OF SOLID-STATE CIRCUITS (2019)

Article Computer Science, Hardware & Architecture

DEEPTOOLS: Compiler and Execution Runtime Extensions for RAPiD AI Accelerator

Swagath Venkataramani et al.

IEEE MICRO (2019)

Proceedings Paper Computer Science, Artificial Intelligence

DLFloat: A 16-b Floating Point format designed for Deep Learning Training and Inference

Ankur Agrawal et al.

2019 IEEE 26TH SYMPOSIUM ON COMPUTER ARITHMETIC (ARITH) (2019)

Proceedings Paper Engineering, Electrical & Electronic

A 0.11 pJ/Op, 0.32-128 TOPS, Scalable Multi-Chip-Module-based Deep Neural Network Accelerator with Ground-Reference Signaling in 16nm

Brian Zimmer et al.

2019 SYMPOSIUM ON VLSI CIRCUITS (2019)

Article Engineering, Electrical & Electronic

DNN Engine: A 28-nm Timing-Error Tolerant Sparse Deep Neural Network Processor for IoT Applications

Paul N. Whatmough et al.

IEEE JOURNAL OF SOLID-STATE CIRCUITS (2018)

Proceedings Paper Computer Science, Artificial Intelligence

In-Datacenter Performance Analysis of a Tensor Processing Unit

Norman P. Jouppi et al.

44TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA 2017) (2017)