Article

Efficient Large-Capacity Caching in Cloud Storage Using Skip-Gram-Based File Correlation Analysis

IEEE ACCESS (2023)

Journal

IEEE ACCESS

Volume 11, Issue -, Pages 111265-111273

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/ACCESS.2023.3322725

Keywords

Cache strategy; cloud storage; file correlation; hit rate; machine learning; prefetching

Categories

Computer Science, Information Systems; Engineering, Electrical & Electronic; Telecommunications

Summary

Designing a high-capacity cache is crucial for improving accessibility of cloud storage. This study introduces a file similarity strategy based on skip-gram to optimize caching and prefetching in cloud storage. By judging the correlation between files, this strategy allows for efficient prefetching and replacement in the cache. The use of this prefetching strategy significantly improves cache hit rate and consumes minimal time during online operations.

Abstract

Designing a high-capacity cache is an essential means of improving the accessibility of cloud storage. Compared with traditional data access, cloud storage data access presents new patterns, and traditional caching strategies do not handle the prefetching and replacement of non-hot data well. Numerous studies have shown that file correlation can optimize caching and prefetching strategies for cloud storage. However, characterizing the correlation between files across multiple dimensions is complex, which makes it correspondingly harder to optimize cloud storage caching using file correlation. To address these shortcomings, this study designs a skip-gram-based file similarity strategy derived from an analysis of user access. By judging the correlation between files, the strategy prefetches files, dynamically inserts them into a high-capacity cache, and handles their replacement. With the prefetching strategy, the cache hit rate improves significantly in the simulation benchmark. In addition, the strategy can build an index table after each training run, so it consumes very little time during online operation. During training, the time required to build the index is $O(N \log V)$, and the time for an index lookup is $O(1)$.
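The abstract describes an offline step that produces an index table and an online step that uses it for prefetching. A minimal Python sketch of that split could look like the following. It is not the authors' implementation: windowed co-occurrence counts stand in for learned skip-gram embeddings, and the names `skipgram_pairs`, `build_index`, and `PrefetchingLRU`, as well as the `window` and `top_k` parameters, are hypothetical. The online lookup is a dictionary access, which is constant time on average.

```python
from collections import Counter, OrderedDict, defaultdict


def skipgram_pairs(trace, window=2):
    """Yield (center, context) pairs from an access trace, the way
    skip-gram samples training pairs from a sliding window."""
    for i, center in enumerate(trace):
        lo = max(0, i - window)
        hi = min(len(trace), i + window + 1)
        for j in range(lo, hi):
            if j != i and trace[j] != center:
                yield center, trace[j]


def build_index(trace, window=2, top_k=2):
    """Offline step (assumption: co-occurrence counts replace a trained
    skip-gram model). Maps each file to its top_k most correlated files."""
    counts = defaultdict(Counter)
    for center, context in skipgram_pairs(trace, window):
        counts[center][context] += 1
    return {f: [g for g, _ in c.most_common(top_k)] for f, c in counts.items()}


class PrefetchingLRU:
    """LRU cache that, on a miss, also inserts files correlated with
    the missed file according to a precomputed index table."""

    def __init__(self, capacity, index):
        self.capacity = capacity
        self.index = index
        self.store = OrderedDict()
        self.hits = 0
        self.misses = 0

    def _insert(self, name):
        self.store[name] = True
        self.store.move_to_end(name)
        while len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the least recently used file

    def access(self, name):
        """Return True on a cache hit, False on a miss."""
        if name in self.store:
            self.hits += 1
            self.store.move_to_end(name)
            return True
        self.misses += 1
        # Prefetch correlated files first so the requested file
        # ends up as the most recently used entry.
        for neighbor in self.index.get(name, []):
            if neighbor not in self.store:
                self._insert(neighbor)
        self._insert(name)
        return False
```

As a usage example, building the index from the trace `["a", "b", "x", "y", "a", "b", "x", "y"]` with `window=1` and `top_k=1` associates `"a"` with `"b"`. A `PrefetchingLRU(2, index)` then misses on `"a"` but hits on the following `"b"`, whereas the same cache with an empty index misses on both.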
