4.3 Article

GPU Domain Specialization via Composable On-Package Architecture

Related References

Note: Only some of the references are listed here; download the original article for the complete reference information.
Proceedings Paper Computer Science, Hardware & Architecture

Ten Lessons From Three Generations Shaped Google's TPUv4i Industrial Product

Norman P. Jouppi et al.

Summary: The paper summarizes lessons learned from Google's deployment of several TPU generations since 2015, highlighting semiconductor technology advancement, compiler compatibility, cost considerations, and multi-tenancy support, and describes how these lessons shaped TPUv4i, which has been deployed since 2020.

2021 ACM/IEEE 48TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA 2021) (2021)

Article Computer Science, Hardware & Architecture

A Domain-Specific Supercomputer for Training Deep Neural Networks

Norman P. Jouppi et al.

COMMUNICATIONS OF THE ACM (2020)

Proceedings Paper Computer Science, Hardware & Architecture

An In-Network Architecture for Accelerating Shared-Memory Multiprocessor Collectives

Benjamin Klenk et al.

2020 ACM/IEEE 47TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA 2020) (2020)

Article Engineering, Electrical & Electronic

Zeppelin: An SoC for Multichip Architectures

Thomas Burd et al.

IEEE JOURNAL OF SOLID-STATE CIRCUITS (2019)

Article Computer Science, Hardware & Architecture

Optimizing Multi-GPU Parallelization Strategies for Deep Learning Training

Saptadeep Pal et al.

IEEE MICRO (2019)

Proceedings Paper Computer Science, Theory & Methods

3D NoCs with Active Interposer for Multi-Die Systems

Vasil Pano et al.

PROCEEDINGS OF THE 13TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON NETWORKS-ON-CHIP (NOCS'19) (2019)

Proceedings Paper Computer Science, Hardware & Architecture

Accelerating Distributed Reinforcement Learning with In-Switch Computing

Youjie Li et al.

PROCEEDINGS OF THE 2019 46TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA '19) (2019)

Proceedings Paper Computer Science, Hardware & Architecture

Combining HW/SW Mechanisms to Improve NUMA Performance of Multi-GPU Systems

Vinson Young et al.

2018 51ST ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO) (2018)

Article Engineering, Electrical & Electronic

Wafer-Level Integration of an Advanced Logic-Memory System Through the Second-Generation CoWoS Technology

S. Y. Hou et al.

IEEE TRANSACTIONS ON ELECTRON DEVICES (2017)

Proceedings Paper Computer Science, Information Systems

Neural Collaborative Filtering

Xiangnan He et al.

PROCEEDINGS OF THE 26TH INTERNATIONAL CONFERENCE ON WORLD WIDE WEB (WWW'17) (2017)

Article Computer Science, Hardware & Architecture

Designing Efficient Heterogeneous Memory Architectures

Evgeny Bolotin et al.

IEEE MICRO (2015)

Article Chemistry, Multidisciplinary

An overview of the Amber biomolecular simulation package

Romelia Salomon-Ferrer et al.

WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE (2013)

Article Biochemistry & Molecular Biology

RELION: Implementation of a Bayesian approach to cryo-EM structure determination

Sjors H. W. Scheres

JOURNAL OF STRUCTURAL BIOLOGY (2012)

Article Mechanics

Fluid-solid coupling on a cluster of GPU graphics cards for seismic wave propagation

Dimitri Komatitsch

COMPTES RENDUS MECANIQUE (2011)

Article Computer Science, Hardware & Architecture

GPUs and the Future of Parallel Computing

Stephen W. Keckler et al.

IEEE MICRO (2011)