☆ 4.2 Article

GraNDe: Near-Data Processing Architecture With Adaptive Matrix Mapping for Graph Convolutional Networks

IEEE COMPUTER ARCHITECTURE LETTERS (2022)

Journal

IEEE COMPUTER ARCHITECTURE LETTERS

Volume 21, Issue 2, Pages 45-48

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/LCA.2022.3182387

Keywords

Random access memory; Bandwidth; Sparse matrices; Performance evaluation; System-on-chip; Registers; Memory management; Near-data processing; DRAM; graph convolutional networks

Category

Computer Science, Hardware & Architecture

Funding

National Research Foundation of Korea (NRF)
Korea government (MSIT) [NRF-2018R1A5A1059921]
Institute of Information & Communications Technology Planning & Evaluation (IITP)
Korea government (MSIT) under Artificial Intelligence Graduate School Program (Seoul National University) [2021-0-01343]
Inha University Research Grant

Summary

Graph Convolutional Network (GCN) models interpret graph data with high accuracy, and aggregation is one of their key components. The proposed near-data processing architecture, GraNDe, accelerates memory-intensive aggregation operations and achieves a speedup of up to 4.3x over the baseline system on open-graph benchmark datasets.

Abstract

Graph Convolutional Network (GCN) models have attracted attention given their high accuracy in interpreting graph data. One of the primary building blocks of a GCN model is aggregation, which gathers and averages the feature vectors of the vertices adjacent to each vertex. Aggregation works by multiplying the adjacency and feature matrices. Both matrices exceed the on-chip cache capacity, and the adjacency matrix is highly sparse. Together, these properties lead to little data reuse and cause numerous main-memory accesses during aggregation, which makes the operation memory-intensive. We propose GraNDe, a near-data processing (NDP) architecture that accelerates memory-intensive aggregation operations by locating processing elements near the DRAM datapath to exploit rank-level parallelism. By exploring the data mapping of the operand matrices to DRAM ranks, we discover that the optimal mapping differs depending on the configuration of a specific GCN layer. With our optimal layer-by-layer mapping scheme, GraNDe shows a speedup of up to 4.3x compared to the baseline system on open-graph benchmark datasets.
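The aggregation step described in the abstract, multiplying a sparse adjacency matrix by a dense feature matrix, can be illustrated with a small example. The following is a minimal NumPy sketch and not the GraNDe implementation: the CSR storage layout, the function name `mean_aggregate_csr`, and the toy graph are all assumptions made for illustration.

```python
import numpy as np

def mean_aggregate_csr(indptr, indices, features):
    """Average the feature vectors of each vertex's neighbors.

    indptr and indices are the row pointers and column indices of the
    adjacency matrix in CSR form (an assumed layout); features holds one
    row per vertex.
    """
    num_vertices = len(indptr) - 1
    out = np.zeros((num_vertices, features.shape[1]), dtype=features.dtype)
    for v in range(num_vertices):
        neighbors = indices[indptr[v]:indptr[v + 1]]
        if len(neighbors) > 0:
            # Gathering neighbor rows touches scattered parts of the
            # feature matrix, which is why there is little data reuse.
            out[v] = features[neighbors].mean(axis=0)
    return out

# Hypothetical toy graph: a triangle on vertices 0, 1, 2 plus isolated vertex 3.
indptr = np.array([0, 2, 4, 6, 6])
indices = np.array([1, 2, 0, 2, 0, 1])
features = np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [2.0, 2.0],
                     [5.0, 5.0]])

aggregated = mean_aggregate_csr(indptr, indices, features)
print(aggregated)
```

On real datasets the feature matrix is far larger than the on-chip cache, so each neighbor gather in the loop is likely to become a main-memory access. That access pattern is the memory-bound behavior the abstract attributes to aggregation.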
