Article

APPcache+: An STT-MRAM-Based Approximate Cache System With Low Power and Long Lifetime

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TCAD.2023.3267713

Keywords

Approximation; cache; compression; encoding; image processing; spin transfer torque magnetic RAM (STT-MRAM); wear-leveling

Traditional SRAM-based caches suit image processing applications poorly because of their high static power and low scalability. Emerging STT-MRAM is a promising alternative thanks to its low leakage power and high density, but it suffers from high write energy. This study proposes an STT-MRAM-based approximate cache architecture (APPcache+) that mitigates this problem by exploiting the error tolerance of image processing applications. APPcache+ combines lightweight similarity-based encoding techniques, a partial read scheme, and a Ping-Pong intraline wear-leveling scheme to reduce energy consumption and improve lifetime. Evaluation results show that, compared with the baseline, APPcache+ reduces energy by 32.58% and improves lifetime by 40.7%, with only 2.2% performance degradation and 1.86% output quality loss.
Due to its high static power and low scalability, the traditional SRAM-based cache is not a good solution for image processing applications. Emerging spin transfer torque magnetic RAM (STT-MRAM) is a promising cache candidate because of its low leakage power and high density. However, STT-MRAM suffers from high write energy. Therefore, by exploiting the tolerance of image processing applications to minor errors, this work presents an STT-MRAM-based APProximate cache architecture (APPcache+) that writes and reads approximate data, which can largely reduce the cache energy and improve the STT-MRAM lifetime. APPcache+ includes three main designs. First, we find that cache lines contain many similar elements (e.g., pixels in images). APPcache+ therefore presents several lightweight similarity-based encoding techniques that remove redundant elements, thus shortening the data size and reducing the energy of the STT-MRAM cache. Second, we design a partial read scheme to reduce the read energy of the STT-MRAM cache. In the traditional decompression process, the whole line is fetched into the decompressor, which wastes read energy. The partial read scheme can largely reduce read energy while keeping the overhead low. Third, we observe that the encoding schemes may lead to bit write imbalance. We therefore propose a lightweight Ping-Pong intraline wear-leveling scheme to improve the lifetime. Extensive evaluation results show that, compared with the baseline, APPcache+ can largely reduce the overall energy by 32.58% and improve lifetime by 40.7%, with only 2.2% performance degradation and 1.86% output quality loss.
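
The abstract does not specify the encoding formats, so the following is only an illustrative sketch of the general idea behind similarity-based approximate encoding: neighbouring elements of a cache line that differ by no more than a small threshold are stored once, together with a repeat count. The function names `encode_line` and `decode_line`, the `threshold` parameter, and the run-length format are all assumptions made for this example and are not taken from the paper.

```python
from typing import List, Tuple


def encode_line(pixels: List[int], threshold: int = 4) -> List[Tuple[int, int]]:
    """Approximate run-length encoding of one cache line (hypothetical format).

    A pixel joins the current run when it differs from the run's
    representative value (the first pixel of the run) by at most `threshold`.
    """
    runs: List[Tuple[int, int]] = []
    for p in pixels:
        if runs and abs(p - runs[-1][0]) <= threshold:
            value, count = runs[-1]
            runs[-1] = (value, count + 1)  # extend the current run
        else:
            runs.append((p, 1))  # start a new run with p as representative
    return runs


def decode_line(runs: List[Tuple[int, int]]) -> List[int]:
    """Expand the runs back into a full line of approximate pixel values."""
    out: List[int] = []
    for value, count in runs:
        out.extend([value] * count)
    return out


if __name__ == "__main__":
    line = [100, 101, 103, 180, 181, 50, 50, 52]
    runs = encode_line(line, threshold=4)
    approx = decode_line(runs)
    print(runs)    # [(100, 3), (180, 2), (50, 3)]
    print(approx)  # [100, 100, 100, 180, 180, 50, 50, 50]
```

In this toy example eight elements shrink to three (value, count) pairs, and every decoded pixel is within `threshold` of its original value. Fewer stored elements mean fewer cells written, which is the kind of saving the abstract attributes to removing redundant elements; setting `threshold=0` makes the sketch lossless.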
