Article

EverAnalyzer: A Self-Adjustable Big Data Management Platform Exploiting the Hadoop Ecosystem

Journal

INFORMATION
Volume 14, Issue 2

Publisher

MDPI
DOI: 10.3390/info14020093

Keywords

Big Data; data management; data collection; data analysis; data processing; Hadoop; MapReduce; Spark; Mahout; MLlib

This paper proposes EverAnalyzer, a self-adjustable Big Data management platform that utilizes multiple frameworks to address different data processing and analysis scenarios. By collecting data and utilizing metadata, the platform is able to recommend the best framework for users. Experimental results demonstrate that EverAnalyzer correctly suggests the optimum framework in the majority of cases.
Big Data is a phenomenon that affects today's world, with new data being generated every second. Enterprises face major challenges from increasingly diverse data, as well as from indexing, searching, and analyzing such enormous volumes of it. Several frameworks and libraries for processing and analyzing Big Data exist. Among them, Hadoop MapReduce, Mahout, Spark, and MLlib appear to be the most popular, although it is unclear which of them is best suited to, and performs best in, a given data processing or analysis scenario. This paper proposes EverAnalyzer, a self-adjustable Big Data management platform built to fill this gap by exploiting all of these frameworks. The platform collects data in both streaming and batch modes, and it uses the metadata obtained from the processing and analysis jobs that its users run on the collected data. Based on this metadata, the platform recommends the optimum framework for the data processing or analysis activity that a user aims to execute. To verify the platform's efficiency, numerous experiments were carried out using 30 diverse datasets related to various diseases. The results show that EverAnalyzer correctly suggested the optimum framework in 80% of the cases, that is, in the majority of the experiments.
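
The abstract does not describe how the recommendation is computed, so the following is only a minimal sketch of what a metadata-based recommendation could look like, not EverAnalyzer's actual method. It assumes each past job is recorded with its framework, task type, dataset size, and runtime, and it picks the framework with the lowest mean runtime on comparable past jobs. The record layout, the `recommend` function, the `tolerance` parameter, and all values in `HISTORY` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical metadata records: (framework, task, dataset size in MB, runtime in s).
# These values are invented for illustration; they are not results from the paper.
HISTORY = [
    ("MapReduce", "processing", 500, 120.0),
    ("Spark", "processing", 500, 45.0),
    ("Spark", "processing", 800, 70.0),
    ("Mahout", "analysis", 500, 200.0),
    ("MLlib", "analysis", 500, 90.0),
]


def recommend(history, task, size_mb, tolerance=0.5):
    """Return the framework with the lowest mean runtime on comparable past jobs.

    A past job is comparable when it has the same task type and its dataset
    size differs from the requested size by at most `tolerance * size_mb`.
    Returns None when no comparable job has been recorded.
    """
    runtimes = defaultdict(list)
    for framework, past_task, past_size, seconds in history:
        if past_task == task and abs(past_size - size_mb) <= tolerance * size_mb:
            runtimes[framework].append(seconds)
    if not runtimes:
        return None
    # Choose the framework whose comparable jobs ran fastest on average.
    return min(runtimes, key=lambda fw: mean(runtimes[fw]))
```

With these made-up records, a 600 MB processing job would be matched against both Spark runs and the single MapReduce run. A request with no comparable history returns `None`, so the caller can fall back to a default framework.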
