Proceedings Paper

Iris: Automatic Generation of Efficient Data Layouts for High Bandwidth Utilization

2023 28TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, ASP-DAC (2023)

Pages 172-177

Publisher

IEEE

DOI: 10.1145/3566097.3567892

Abstract

Optimizing data movement is crucial for handling the data deluge and big data applications in heterogeneous computing. Although modern high-level synthesis (HLS) tools are efficient at optimizing the computational aspects of an accelerator, data transfers still leave room for improvement. Novel architectures, such as High-Bandwidth Memory with wider data buses, have been developed to address this issue. However, designers need to tailor their hardware/software interfaces to fully utilize the available bandwidth. We propose a methodology that automates the discovery and implementation of a data layout that maximizes utilization of the available bandwidth when streaming data between memory and an accelerator.

Optimizing data movement is becoming one of the biggest challenges in heterogeneous computing, which must cope with the data deluge and, consequently, big data applications. When creating specialized accelerators, modern high-level synthesis (HLS) tools are increasingly efficient at optimizing the computational aspects, but data transfers have not seen comparable improvement. To address this, novel architectures such as High-Bandwidth Memory with wider data buses have been developed so that more data can be transferred in parallel. Designers must tailor their hardware/software interfaces to fully exploit the available bandwidth. HLS tools can automate this process, but the designer must follow strict coding-style rules. If the bus width is not evenly divisible by the data width (e.g., when using custom-precision data types) or if the array lengths are not powers of two, the HLS-generated accelerator will likely not fully utilize the available bandwidth, demanding even more manual effort from the designer. We propose a methodology to automatically find and implement a data layout that, when streamed between memory and an accelerator, uses a higher percentage of the available bandwidth than a naive or HLS-optimized design. We borrow concepts from multiprocessor scheduling to achieve such high efficiency.
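The divisibility problem described in the abstract can be illustrated with a small calculation. The sketch below is not the Iris algorithm. It only compares two simple layouts under assumed parameters: a 256-bit bus, a 24-bit custom-precision type, and an array of 1000 elements, none of which come from the paper. The function names `aligned_utilization` and `packed_utilization` are hypothetical.

```python
def aligned_utilization(bus_bits: int, elem_bits: int, n_elems: int) -> float:
    """Useful-bit fraction when elements never straddle a bus word.

    Each bus word carries floor(bus_bits / elem_bits) whole elements and the
    remaining bits are padding (an assumed model of a naive layout).
    """
    per_word = bus_bits // elem_bits
    if per_word == 0:
        raise ValueError("element is wider than the bus")
    words = (n_elems + per_word - 1) // per_word  # ceiling division
    return (n_elems * elem_bits) / (words * bus_bits)


def packed_utilization(bus_bits: int, elem_bits: int, n_elems: int) -> float:
    """Useful-bit fraction when elements are packed back to back.

    Elements may straddle word boundaries, so only the final word is padded.
    """
    total_bits = n_elems * elem_bits
    words = (total_bits + bus_bits - 1) // bus_bits  # ceiling division
    return total_bits / (words * bus_bits)


if __name__ == "__main__":
    # Assumed example: 256-bit bus, 24-bit elements, 1000 elements.
    print(aligned_utilization(256, 24, 1000))           # 0.9375
    print(round(packed_utilization(256, 24, 1000), 4))  # 0.9973
    # When the bus width is a multiple of the element width and the array
    # fills whole words, the aligned layout wastes nothing.
    print(aligned_utilization(256, 32, 1000))           # 1.0
```

In this assumed case the aligned layout leaves 16 of every 256 bits unused, while dense packing pads only the last word. Unpacking a densely packed stream on the accelerator side costs extra logic, which is one reason an automated layout search is attractive.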
