☆ 3.8 Article

EverAnalyzer: A Self-Adjustable Big Data Management Platform Exploiting the Hadoop Ecosystem

INFORMATION (2023)

期刊

INFORMATION

卷 14, 期 2, 页码 -

出版社

MDPI

DOI: 10.3390/info14020093

关键词

Big Data; data management; data collection; data analysis; data processing; Hadoop; MapReduce; Spark; Mahout; MLlib

类别

Computer Science, Information Systems

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

智能总结 New
摘要

This paper proposes EverAnalyzer, a self-adjustable Big Data management platform that utilizes multiple frameworks to address different data processing and analysis scenarios. By collecting data and utilizing metadata, the platform is able to recommend the best framework for users. Experimental results demonstrate that EverAnalyzer correctly suggests the optimum framework in the majority of cases.

Big Data is a phenomenon that affects today's world, with new data being generated every second. Today's enterprises face major challenges from the increasingly diverse data, as well as from indexing, searching, and analyzing such enormous amounts of data. In this context, several frameworks and libraries for processing and analyzing Big Data exist. Among those frameworks Hadoop MapReduce, Mahout, Spark, and MLlib appear to be the most popular, although it is unclear which of them best suits and performs in various data processing and analysis scenarios. This paper proposes EverAnalyzer, a self-adjustable Big Data management platform built to fill this gap by exploiting all of these frameworks. The platform is able to collect data both in a streaming and in a batch manner, utilizing the metadata obtained from its users' processing and analytical processes applied to the collected data. Based on this metadata, the platform recommends the optimum framework for the data processing/analytical activities that the users aim to execute. To verify the platform's efficiency, numerous experiments were carried out using 30 diverse datasets related to various diseases. The results revealed that EverAnalyzer correctly suggested the optimum framework in 80% of the cases, indicating that the platform made the best selections in the majority of the experiments.

EverAnalyzer: A Self-Adjustable Big Data Management Platform Exploiting the Hadoop Ecosystem

期刊

INFORMATION

出版社

MDPI

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

EverAnalyzer: A Self-Adjustable Big Data Management Platform Exploiting the Hadoop Ecosystem

期刊

INFORMATION

出版社

MDPI

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文