Article

Survey of Distributed Computing Frameworks for Supporting Big Data Analysis

BIG DATA MINING AND ANALYTICS (2023)

Journal

BIG DATA MINING AND ANALYTICS

Volume 6, Issue 2, Pages 154-169

Publisher

TSINGHUA UNIV PRESS

DOI: 10.26599/BDMA.2022.9020014

Keywords

Analytical models; Costs; Computational modeling; Clustering algorithms; Distributed databases; Big Data; Programming; distributed computing frameworks; big data analysis; approximate computing; MapReduce computing model

Categories

Computer Science, Artificial Intelligence; Computer Science, Information Systems

Summary

Distributed computing frameworks are essential for efficient processing of big data. The current MapReduce model is inadequate for complex analysis tasks on terabytes of data. New frameworks are needed to overcome these challenges.

Abstract

Distributed computing frameworks are the fundamental component of distributed computing systems. They provide an essential way to support the efficient processing of big data on clusters or in the cloud. The size of big data is increasing faster than the big data processing capacity of clusters. Thus, distributed computing frameworks based on the MapReduce computing model are not adequate to support big data analysis tasks, which often require running complex analytical algorithms on extremely large, terabyte-scale data sets. In performing such tasks, these frameworks face three challenges: computational inefficiency due to high I/O and communication costs, non-scalability to big data due to memory limits, and a limited set of analytical algorithms, because many serial algorithms cannot be implemented in the MapReduce programming model. New distributed computing frameworks need to be developed to overcome these challenges. In this paper, we review the MapReduce-type distributed computing frameworks currently used to handle big data and discuss their problems when conducting big data analysis. In addition, we present a non-MapReduce distributed computing framework that has the potential to overcome big data analysis challenges.
