4.7 Article

The Doctrine of MEAN: Realizing Deduplication Storage at Unreliable Edge

Journal

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TPDS.2023.3305460

Keywords

Deduplication; fault tolerance; storage system; edge computing

This article proposes MEAN, a deduplication-enabled storage system that uses unreliable resources at the network edge. MEAN places similar files together for better deduplication and maintains replicas of popular files for higher reliability. The authors formulate the placement problem, prove its NP-hardness, and provide efficient heuristics based on similarity-aware hierarchical clustering. An evaluation on a real-world dataset shows that MEAN improves the file hit ratio by 77% and reduces file retrieval delay by up to 71% compared with the state-of-the-art approach.
Placing popular data at the network edge helps reduce retrieval latency, but it also strains the limited edge storage space. Currently, using available yet not necessarily reliable edge resources is a common way to expand edge storage space, while deploying deduplication storage strategies is a general method for better space utilization. However, a contradiction arises when data deduplication is implemented jointly with unreliable edge resources. On the one hand, the deduplication policy stipulates that any data chunk is stored exactly once; on the other hand, the use of unreliable resources requires that data be backed up for the sake of file availability. To resolve this contradiction, we propose MEAN, a deduplication-enabled storage system using unreliable resources at the network edge. The core idea of MEAN is to place similar files together for better deduplication and maintain replicas of popular files for higher reliability. We first formulate this problem and prove its NP-hardness, then provide efficient heuristics based on similarity-aware hierarchical clustering. Three different reliability scenarios are comprehensively considered to develop our algorithms. We also implement a prototype system and evaluate the performance of MEAN with a real-world dataset. The results show that MEAN can improve the file hit ratio under unreliable environments by 77% while reducing the file retrieval delay by up to 71%, compared with the state-of-the-art approach.
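
To make the idea of placing similar files together more concrete, the following is a minimal illustrative sketch, not the algorithm from the paper. It assumes each file is represented as a set of chunk fingerprints, measures similarity with the Jaccard index, and greedily merges the most similar groups until one group per edge node remains. The names jaccard, cluster_files, and stored_chunks and the toy data are hypothetical; replication of popular files and the reliability scenarios are not modeled.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two chunk-fingerprint sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_files(files: dict, num_nodes: int) -> list:
    """Greedy agglomerative clustering (an assumed stand-in heuristic).

    Repeatedly merges the two groups whose combined chunk sets are most
    similar, until only num_nodes groups remain.
    """
    groups = [([name], set(chunks)) for name, chunks in files.items()]
    while len(groups) > num_nodes:
        i, j = max(
            combinations(range(len(groups)), 2),
            key=lambda p: jaccard(groups[p[0]][1], groups[p[1]][1]),
        )
        merged = (groups[i][0] + groups[j][0], groups[i][1] | groups[j][1])
        groups = [g for k, g in enumerate(groups) if k not in (i, j)] + [merged]
    return [sorted(names) for names, _ in groups]


def stored_chunks(files: dict, placement: list) -> int:
    """Total chunks stored when each node deduplicates its own files."""
    return sum(len(set().union(*(files[f] for f in node))) for node in placement)


# Toy data (hypothetical): file name -> set of chunk fingerprints.
files = {
    "a.txt": {1, 2, 3, 4},
    "b.txt": {1, 2, 3, 5},
    "c.txt": {7, 8, 9},
    "d.txt": {7, 8, 10},
}

similar = cluster_files(files, num_nodes=2)
mixed = [["a.txt", "c.txt"], ["b.txt", "d.txt"]]
print(similar, stored_chunks(files, similar))
print(mixed, stored_chunks(files, mixed))
```

In this toy case, grouping similar files stores 9 chunks in total, whereas the mixed placement stores 14, which is the space saving that similarity-aware placement aims for.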
