Article

Core Placement Optimization for Multi-chip Many-core Neural Network Systems with Reinforcement Learning

ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS (2021)

Journal

ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS

Volume 26, Issue 2

Publisher

ASSOC COMPUTING MACHINERY

DOI: 10.1145/3418498

Keywords

Multi-chip many-core architecture; neural network accelerator; core placement optimization; machine learning for system

Funding

NSF [1725447, 1719160, 1730309]

Automated Summary

The study proposes a reinforcement-learning-based method, built on deep deterministic policy gradient, to optimize core placement in multi-chip many-core neural network systems. Reported results show a 1.99x throughput increase and a 50.5% latency reduction over naive sequential placement, and a 1.22x throughput improvement and an 18.6% latency reduction over simulated annealing.

Abstract

Multi-chip many-core neural network systems provide high parallelism through decentralized execution, and they can be scaled to very large systems at reasonable fabrication cost. As multi-chip many-core systems scale up, communication latency accounts for a growing share of overall system performance. Previous work mainly focuses on core placement within a single chip, leaving two principal issues unresolved: the communication problems caused by the non-uniform, hierarchical on-chip and off-chip communication capability of multi-chip systems, and the scalability of heuristic-based approaches in a factorially growing search space. To this end, we propose a reinforcement-learning-based method that automatically optimizes core placement using deep deterministic policy gradient. The method gathers information about the environment by performing a series of trials (i.e., placements) and uses convolutional neural networks to extract spatial features of different placements. Experimental results indicate that, compared with a naive sequential placement, the proposed method achieves a 1.99x increase in throughput and a 50.5% reduction in latency; compared with simulated annealing, an effective technique for approximating the global optimum in an extremely large search space, our method improves throughput by 1.22x and reduces latency by 18.6%. We further demonstrate that the proposed method is capable of finding optimal placements that take advantage of the different communication properties of different system configurations, and that it works in a topology-agnostic manner.
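To make the placement problem concrete, the following is a minimal sketch of the two baselines named in the abstract: naive sequential placement and simulated annealing. It does not reproduce the paper's reinforcement-learning method. The cost model (Manhattan hops plus a fixed penalty per chip boundary crossed), the square-chip layout, and every name and parameter here (`hop_cost`, `placement_cost`, `simulated_annealing`, `off_chip_penalty`, `chip_w`) are illustrative assumptions, not taken from the paper.

```python
import math
import random


def hop_cost(a, b, chip_w, off_chip_penalty):
    """Assumed cost between grid slots a and b (row, col):
    Manhattan hops plus a penalty for each chip boundary crossed."""
    (ra, ca), (rb, cb) = a, b
    hops = abs(ra - rb) + abs(ca - cb)
    crossings = abs(ra // chip_w - rb // chip_w) + abs(ca // chip_w - cb // chip_w)
    return hops + off_chip_penalty * crossings


def placement_cost(perm, traffic, grid_w, chip_w, off_chip_penalty=4.0):
    """Total traffic-weighted cost; perm[i] is the slot given to logical core i."""
    total = 0.0
    for (src, dst), volume in traffic.items():
        a = divmod(perm[src], grid_w)  # slot index -> (row, col)
        b = divmod(perm[dst], grid_w)
        total += volume * hop_cost(a, b, chip_w, off_chip_penalty)
    return total


def simulated_annealing(traffic, n, grid_w, chip_w, steps=20000, t0=5.0, seed=0):
    """Swap-based annealing over permutations of n cores onto n slots."""
    rng = random.Random(seed)
    perm = list(range(n))  # start from the naive sequential placement
    cost = placement_cost(perm, traffic, grid_w, chip_w)
    best, best_cost = perm[:], cost
    for step in range(steps):
        temp = t0 * (1.0 - step / steps) + 1e-9  # linear cooling schedule
        i, j = rng.sample(range(n), 2)
        perm[i], perm[j] = perm[j], perm[i]
        new_cost = placement_cost(perm, traffic, grid_w, chip_w)
        accept = new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp)
        if accept:
            cost = new_cost
            if cost < best_cost:
                best, best_cost = perm[:], cost
        else:
            perm[i], perm[j] = perm[j], perm[i]  # undo the rejected swap
    return best, best_cost


if __name__ == "__main__":
    # Toy system: 16 cores on a 4x4 grid made of four 2x2-core chips.
    n, grid_w, chip_w = 16, 4, 2
    traffic = {(i, (i * 5 + 3) % n): 1.0 for i in range(n)}
    sequential = placement_cost(list(range(n)), traffic, grid_w, chip_w)
    _, annealed = simulated_annealing(traffic, n, grid_w, chip_w)
    print("sequential cost:", sequential)
    print("annealed cost:  ", annealed)
    print("placements to search:", math.factorial(n))
```

With n cores and n slots there are n! candidate placements, which is the factorial growth the abstract refers to. A cost that penalizes chip-boundary crossings is one simple way to model the non-uniform on-chip and off-chip communication the abstract describes; the actual model used in the paper may differ.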
