
Accelerating ML/DL Applications With Hierarchical Caching on Deduplication Storage Clusters

IEEE TRANSACTIONS ON BIG DATA (2022)

Journal

IEEE TRANSACTIONS ON BIG DATA

Volume 8, Issue 6, Pages 1622-1636

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TBDATA.2021.3106345

Keywords

Training; Measurement; Deep learning; Layout; Memory; Big Data; Faces; Machine learning; deep learning; big data; storage management; deduplication

Categories

Computer Science, Information Systems; Computer Science, Theory & Methods

Funding

Institute of Information & Communications Technology Planning & Evaluation (IITP) - Korea government (MSIT) [2021-002051]

Summary

This paper introduces Redup, which addresses the performance drop caused by enabling deduplication in ML/DL storage clusters. Its Redup Caching Manager (RDCM) provides a deduplication-aware caching mechanism that accelerates ML/DL read operations.

Abstract

Large-scale machine learning (ML) and deep learning (DL) platforms face challenges when integrated with deduplication-enabled storage clusters. In the quest for smart and efficient storage utilization, removing duplicate data introduces bottlenecks, since deduplication alters the I/O transaction layout of the storage system. It is therefore critical to address this deduplication overhead in order to accelerate ML/DL computation on deduplication storage. Existing state-of-the-art ML/DL storage solutions such as Alluxio and AutoCache adopt non-deduplication-aware caching mechanisms, which lack the much-needed performance boost when adopted in deduplication-enabled ML/DL clusters. In this paper, we introduce Redup, which eliminates the performance drop caused by enabling deduplication in ML/DL storage clusters. At its core is a Redup Caching Manager (RDCM), composed of a 2-tier deduplication layout-aware caching mechanism. The RDCM provides an abstraction of the underlying deduplication storage layout to ML/DL applications and provisions a decoupled acceleration of object reconstruction during ML/DL read operations. Our evaluation of Redup shows a negligible drop in ML/DL training performance compared to a cluster without deduplication, whilst significantly outperforming Alluxio and AutoCache on various performance metrics.
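To make the idea of layout-aware caching over deduplicated storage more concrete, the following is a minimal Python sketch. It is not the paper's RDCM implementation. The abstract does not say what each of the two tiers holds, so the split used here (one tier caching per-object chunk recipes, one tier caching chunk data) is an assumption, as are the LRU eviction policy, the fixed-size chunking, and every class and parameter name (`LRUCache`, `DedupStore`, `TwoTierCache`, `recipe_slots`, `chunk_slots`).

```python
import hashlib
from collections import OrderedDict


class LRUCache:
    """Tiny LRU cache used for both tiers in this sketch (hypothetical helper)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        # Return the cached value and mark it most recently used, or None on a miss.
        if key in self.items:
            self.items.move_to_end(key)
            self.hits += 1
            return self.items[key]
        self.misses += 1
        return None

    def put(self, key, value):
        # Insert or refresh an entry, evicting the least recently used one if full.
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)


class DedupStore:
    """Toy deduplicated backend: unique chunks plus an ordered recipe per object."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}   # fingerprint -> chunk bytes (each unique chunk stored once)
        self.recipes = {}  # object name -> ordered list of chunk fingerprints

    def write(self, name, data):
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            fp = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(fp, chunk)  # duplicate chunks are not stored again
            recipe.append(fp)
        self.recipes[name] = recipe


class TwoTierCache:
    """Assumed two-tier layout: tier 1 caches recipes, tier 2 caches chunk data."""

    def __init__(self, store, recipe_slots=8, chunk_slots=64):
        self.store = store
        self.recipe_cache = LRUCache(recipe_slots)
        self.chunk_cache = LRUCache(chunk_slots)

    def read(self, name):
        # Tier 1: resolve the object's chunk layout without touching the backend if cached.
        recipe = self.recipe_cache.get(name)
        if recipe is None:
            recipe = self.store.recipes[name]
            self.recipe_cache.put(name, recipe)
        # Tier 2: reconstruct the object chunk by chunk, reusing cached chunks.
        parts = []
        for fp in recipe:
            chunk = self.chunk_cache.get(fp)
            if chunk is None:
                chunk = self.store.chunks[fp]
                self.chunk_cache.put(fp, chunk)
            parts.append(chunk)
        return b"".join(parts)
```

In this toy model, a chunk shared by several objects is fetched from the backend once and then served from the chunk tier when any other object containing it is reconstructed. That is one plausible way a deduplication-aware cache could gain an advantage over a cache keyed only by whole objects.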
