Article

Multi-Target Adaptive Reconfigurable Acceleration for Low-Power IoT Processing

Journal

IEEE TRANSACTIONS ON COMPUTERS
Volume 70, Issue 1, Pages 83-98

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TC.2020.2984736

Keywords

Energy consumption; Program processors; Hardware; Acceleration; Fabrics; VLIW; Computer architecture; Adaptive; coarse-grained reconfigurable; CGRA; heterogeneous CMPs; big.LITTLE; accelerators; hardware; dynamic binary translation; architecture; performance; power; energy

Abstract

The article proposes extending a single-ISA heterogeneous CMP with CGRA and DBT modules to accelerate applications in different scenarios. It introduces an additional voltage rail for low-energy operation and leverages structural features of the CGRA to address the implementation challenges of NTV computing. Performance and energy consumption improve for less than 35% area overhead compared to the baseline CMP.

Low-power processors for the Internet of Things (IoT) demand a high degree of adaptability to efficiently execute applications with different resource requirements under varying scenarios. Current single-ISA heterogeneous Chip Multiprocessors (CMPs), such as ARM's big.LITTLE, provide multiple cores and voltage/frequency levels to address this challenge. However, selecting the best core type and voltage/frequency level for every execution scenario, across different applications and program phases, remains an unsolved problem. In this article, we propose extending such a single-ISA heterogeneous CMP with a Coarse-Grained Reconfigurable Array (CGRA) and a hardware-based dynamic binary translation (DBT) module that transparently maps application code onto the CGRA for acceleration. To reach low energy levels and manage the CGRA's power consumption efficiently, we introduce an additional voltage rail that enables operation in the Near-Threshold Voltage (NTV) regime when needed, leveraging key features of the CGRA's structure to address the implementation challenges of NTV computing. For less than 35 percent area overhead relative to the baseline CMP, the proposed design, MuTARe, improves performance and energy consumption as follows: (a) compared to power-efficient execution on the LITTLE core, it achieves a 29 percent reduction in energy consumption and a 2x speedup; (b) compared to performance-efficient execution on the big core, it achieves a 1.6x speedup with a 41 percent energy reduction.
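
The reported speedups and energy reductions can be turned into relative execution time, average power, and energy-delay product (EDP) with a few lines of arithmetic. The sketch below is only an illustration: the function name `relative_metrics` is hypothetical, the average-power and EDP figures are derived here and are not reported in the abstract, and the calculation assumes each speedup and energy figure refers to the same workload and baseline.

```python
def relative_metrics(speedup: float, energy_reduction: float) -> dict:
    """Return ratios versus a baseline normalized to 1.0.

    speedup: baseline_time / new_time (e.g. 2.0 for a 2x speedup)
    energy_reduction: fraction of baseline energy saved (e.g. 0.29 for 29 percent)
    """
    time_ratio = 1.0 / speedup              # new_time / baseline_time
    energy_ratio = 1.0 - energy_reduction   # new_energy / baseline_energy
    return {
        "time": time_ratio,
        "energy": energy_ratio,
        # Average power is energy divided by time (derived, not reported).
        "avg_power": energy_ratio / time_ratio,
        # Energy-delay product is energy times time (derived, not reported).
        "edp": energy_ratio * time_ratio,
    }


# Figures taken from the abstract; everything else is derived.
vs_little = relative_metrics(speedup=2.0, energy_reduction=0.29)
vs_big = relative_metrics(speedup=1.6, energy_reduction=0.41)

for name, metrics in (("LITTLE", vs_little), ("big", vs_big)):
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

Under these assumptions, the derived average power is higher than the LITTLE core's (the work finishes in half the time for 71 percent of the energy) and slightly lower than the big core's, while the derived EDP falls to roughly a third of either baseline.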
