☆ 4.2 Article

A GPU-based tabu search for very large hardware/software partitioning with limited resource usage

JOURNAL OF ADVANCED MECHANICAL DESIGN SYSTEMS AND MANUFACTURING (2017)

Journal

JOURNAL OF ADVANCED MECHANICAL DESIGN SYSTEMS AND MANUFACTURING

Volume 11, Issue 5, Pages -

Publisher

JAPAN SOC MECHANICAL ENGINEERS

DOI: 10.1299/jamdsm.2017jamdsm0060

Keywords

Hardware/software co-design; Hardware/software partitioning; GPU-based tabu search; GPU resource-limitation; Time-space tradeoff

Categories

Engineering, Manufacturing; Engineering, Mechanical

Funding

National Science Foundation of China [61472289]
Key Technology R&D Program of Hubei Province [2014 BAA153]

Abstract

In hardware/software (HW/SW) co-design, HW/SW partitioning is the most important step, since it determines which components are implemented in hardware and which in software. Because most HW/SW partitioning problems are NP-hard, heuristic methods have to be used to solve them, especially for large problem instances. GPU-based heuristic methods are a promising way to reduce the run time of HW/SW co-design. However, existing methods cannot deal with very large embedded applications because of GPU resource limitations. This paper presents a method that overcomes these limitations for very large partitioning problems while keeping a reasonable run time. First, at the stage of computing the costs of the candidates, we propose a fast method of 2-flipping computation for very large HW/SW co-design. The method is general and handles both odd and even numbers of nodes. More importantly, it avoids double-precision arithmetic units, which are scarce resources in GPU architectures. Second, since the GPU is memory-constrained and the costs of all candidates cannot be stored directly in the GPU's global memory, we present a time-space tradeoff strategy to break the memory limitation for very large HW/SW partitioning. In this way, the subsequent steps can run within the GPU's memory limits. Third, an in-place removal of infeasible solutions is proposed, which halves the global memory overhead when the neighborhood is compacted. Fourth, when evaluating the tabu status of feasible candidates, we present a bitwise representation of tabu status to minimize the transfer overhead. Finally, we conduct a number of experiments. The results show that the proposed 2-flipping method works well with single-precision data types. The results also demonstrate that the proposed approach expands the number of nodes of the task graph from 10,000 to 30,000 under the limitation of 6 GB of GPU global memory. The correlations between compression intensity and solution quality are analyzed to ensure the fairness and soundness of our method. Our work is general and can provide guidance for other applications.
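To make the memory pressure and the bitwise tabu idea in the abstract concrete, the following is a minimal CPU-side sketch in Python. It is not the paper's GPU implementation. It assumes that a 2-flip move flips an unordered pair of distinct nodes, so a graph of n nodes has n(n-1)/2 candidates, and that tabu status is one yes/no flag per candidate. The function names `two_flip_count`, `pack_flags`, and `is_tabu` are hypothetical.

```python
def two_flip_count(n: int) -> int:
    """Number of unordered node pairs, assuming a 2-flip move flips two distinct nodes."""
    return n * (n - 1) // 2


def pack_flags(flags) -> bytes:
    """Pack an iterable of booleans into bytes, 8 flags per byte (bit i % 8 of byte i // 8)."""
    flags = list(flags)
    packed = bytearray((len(flags) + 7) // 8)
    for i, flag in enumerate(flags):
        if flag:
            packed[i >> 3] |= 1 << (i & 7)
    return bytes(packed)


def is_tabu(packed: bytes, i: int) -> bool:
    """Read flag i back out of the packed representation."""
    return bool(packed[i >> 3] & (1 << (i & 7)))


# Candidate count for the largest task graph mentioned in the abstract.
candidates = two_flip_count(30_000)

# Storage for one single-precision (4-byte) cost per candidate, in bytes.
cost_bytes = candidates * 4

# Storage for tabu flags: one byte per flag versus one bit per flag.
flag_bytes_unpacked = candidates
flag_bytes_packed = (candidates + 7) // 8

# Small round-trip example of the packing itself.
example = [True, False, False, True, False, False, False, False, True]
packed_example = pack_flags(example)
```

Under these assumptions, 30,000 nodes give about 450 million candidates, and packing the flags shrinks that one array by a factor of eight, which is the kind of saving that matters when data must be transferred between host and device.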
