4.1 Article

Survey of Distributed Computing Frameworks for Supporting Big Data Analysis

Related references

Note: Only some of the references are listed.
Article Computer Science, Information Systems

TiDB: A Raft-based HTAP Database

Dongxu Huang et al.

PROCEEDINGS OF THE VLDB ENDOWMENT (2020)

Proceedings Paper Computer Science, Hardware & Architecture

Distributed and Parallel Ensemble Classification for Big Data Based on Kullback-Leibler Random Sample Partition

Chenghao Wei et al.

ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2020, PT I (2020)

Proceedings Paper Computer Science, Information Systems

Big data and Spark: Comparison with Hadoop

Yassine Benlachmi et al.

PROCEEDINGS OF THE 2020 FOURTH WORLD CONFERENCE ON SMART TRENDS IN SYSTEMS, SECURITY AND SUSTAINABILITY (WORLDS4 2020) (2020)

Article Computer Science, Information Systems

Distributed Data Strategies to Support Large-Scale Data Analysis Across Geo-Distributed Data Centers

Tamer Z. Emara et al.

IEEE ACCESS (2020)

Article Computer Science, Artificial Intelligence

A Survey of Data Partitioning and Sampling Methods to Support Big Data Analysis

Mohammad Sultan Mahmud et al.

BIG DATA MINING AND ANALYTICS (2020)

Article Statistics & Probability

Sampling Techniques for Big Data Analysis

Jae Kwang Kim et al.

INTERNATIONAL STATISTICAL REVIEW (2019)

Article Computer Science, Software Engineering

A distributed data management system to support large-scale data analysis

Tamer Z. Emara et al.

JOURNAL OF SYSTEMS AND SOFTWARE (2019)

Article Automation & Control Systems

Random Sample Partition: A Distributed Data Model for Big Data Analysis

Salman Salloum et al.

IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS (2019)

Article Computer Science, Information Systems

Wireless MapReduce Distributed Computing

Fan Li et al.

IEEE TRANSACTIONS ON INFORMATION THEORY (2019)

Article Computer Science, Information Systems

An Asymptotic Ensemble Learning Framework for Big Data Analysis

Salman Salloum et al.

IEEE ACCESS (2019)

Article Computer Science, Theory & Methods

Exploring and cleaning big data with random sample data blocks

Salman Salloum et al.

JOURNAL OF BIG DATA (2019)

Proceedings Paper Computer Science, Hardware & Architecture

TensorFlow on state-of-the-art HPC clusters: a machine learning use case

Guillem Ramirez-Gargallo et al.

2019 19TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID) (2019)

Article Computer Science, Information Systems

H2Hadoop: Improving Hadoop Performance Using the Metadata of Related Jobs

Hamoud Alshammari et al.

IEEE TRANSACTIONS ON CLOUD COMPUTING (2018)

Article Computer Science, Theory & Methods

HPC Cloud for Scientific and Business Applications: Taxonomy, Vision, and Research Challenges

Marco A. S. Netto et al.

ACM COMPUTING SURVEYS (2018)

Article Computer Science, Software Engineering

Efficient Parallel Random Sampling - Vectorized, Cache-Efficient, and Online

Peter Sanders et al.

ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE (2018)

Article Computer Science, Hardware & Architecture

Load balancing in reducers for skewed data in MapReduce systems by using scalable simple random sampling

Elaheh Gavagsaz et al.

JOURNAL OF SUPERCOMPUTING (2018)

Review Computer Science, Artificial Intelligence

Ensemble learning: A survey

Omer Sagi et al.

WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY (2018)

Article Computer Science, Information Systems

DLoBD: A Comprehensive Study of Deep Learning over Big Data Stacks on HPC Clusters

Xiaoyi Lu et al.

IEEE TRANSACTIONS ON MULTI-SCALE COMPUTING SYSTEMS (2018)

Article Computer Science, Theory & Methods

Distributed stream clustering using micro-clusters on Apache Storm

Pasan Karunaratne et al.

JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING (2017)

Article Computer Science, Information Systems

Big Data Processing Stacks

Sherif Sakr

IT Professional (2017)

Proceedings Paper Computer Science, Information Systems

I-sampling: A New Block-Based Sampling Method for Large-Scale Dataset

Yulin He et al.

2017 IEEE 6TH INTERNATIONAL CONGRESS ON BIG DATA (BIGDATA CONGRESS 2017) (2017)

Article Computer Science, Information Systems

State Management in Apache Flink®: Consistent Stateful Distributed Stream Processing

Paris Carbone et al.

PROCEEDINGS OF THE VLDB ENDOWMENT (2017)

Proceedings Paper Computer Science, Artificial Intelligence

Empirical Analysis of Asymptotic Ensemble Learning for Big Data

Salman Salloum et al.

2016 3RD IEEE/ACM INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING, APPLICATIONS AND TECHNOLOGIES (BDCAT) (2016)

Proceedings Paper Computer Science, Theory & Methods

Apache Flink: Stream Analytics at Scale

Asterios Katsifodimos et al.

2016 IEEE INTERNATIONAL CONFERENCE ON CLOUD ENGINEERING WORKSHOP (IC2EW) (2016)

Article Computer Science, Theory & Methods

Performance Optimization for Managing Massive Numbers of Small Files in Distributed File Systems

Songling Fu et al.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS (2015)

Proceedings Paper Computer Science, Hardware & Architecture

Hadoop, MapReduce and HDFS: A Developers Perspective

Mohd Rehan Ghazi et al.

INTERNATIONAL CONFERENCE ON COMPUTER, COMMUNICATION AND CONVERGENCE (ICCC 2015) (2015)

Proceedings Paper Computer Science, Interdisciplinary Applications

A Survey on Distributed File System Technology

J. Blomer

16TH INTERNATIONAL WORKSHOP ON ADVANCED COMPUTING AND ANALYSIS TECHNIQUES IN PHYSICS RESEARCH (ACAT2014) (2015)

Proceedings Paper Computer Science, Information Systems

Spark SQL: Relational Data Processing in Spark

Michael Armbrust et al.

SIGMOD'15: PROCEEDINGS OF THE 2015 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA (2015)

Review Computer Science, Hardware & Architecture

A comprehensive view of Hadoop research - A systematic literature review

Ivanilton Polato et al.

JOURNAL OF NETWORK AND COMPUTER APPLICATIONS (2014)

Article Computer Science, Theory & Methods

SHadoop: Improving MapReduce performance by optimizing job execution mechanism in Hadoop clusters

Rong Gu et al.

JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING (2014)

Review Multidisciplinary Sciences

Challenges of Big Data analysis

Jianqing Fan et al.

NATIONAL SCIENCE REVIEW (2014)

Proceedings Paper Computer Science, Artificial Intelligence

Sampling for Big Data: A Tutorial

Graham Cormode et al.

PROCEEDINGS OF THE 20TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING (KDD'14) (2014)

Proceedings Paper Computer Science, Hardware & Architecture

Parallel and distributed computing: Memories of Time Past and a Glimpse at the Future

Dan C. Marinescu

2014 IEEE 13TH INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED COMPUTING (ISPDC) (2014)

Article Computer Science, Information Systems

iMapReduce: A Distributed Computing Framework for Iterative Computation

Yanfeng Zhang et al.

JOURNAL OF GRID COMPUTING (2012)

Article Computer Science, Hardware & Architecture

The HaLoop approach to large-scale iterative data analysis

Yingyi Bu et al.

VLDB JOURNAL (2012)

Article Computer Science, Hardware & Architecture

The TianHe-1A Supercomputer: Its Hardware and Software

Xue-Jun Yang et al.

JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY (2011)

Article Computer Science, Hardware & Architecture

MapReduce: A Flexible Data Processing Tool

Jeffrey Dean et al.

COMMUNICATIONS OF THE ACM (2010)

Article Computer Science, Hardware & Architecture

DFS: A File System for Virtualized Flash Storage

William K. Josephson et al.

ACM TRANSACTIONS ON STORAGE (2010)

Article Computer Science, Information Systems

Hive - A Warehousing Solution Over a Map-Reduce Framework

Ashish Thusoo et al.

PROCEEDINGS OF THE VLDB ENDOWMENT (2009)

Article Computer Science, Information Systems

Building a High-Level Dataflow System on top of Map-Reduce: The Pig Experience

Alan F. Gates et al.

PROCEEDINGS OF THE VLDB ENDOWMENT (2009)

Article Computer Science, Hardware & Architecture

MapReduce: Simplified data processing on large clusters

Jeffrey Dean et al.

COMMUNICATIONS OF THE ACM (2008)

Article Computer Science, Theory & Methods

A high performance algorithm for static task scheduling in heterogeneous distributed computing systems

Mohammad I. Daoud et al.

JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING (2008)

Article Computer Science, Software Engineering

A taxonomy and survey of grid resource management systems for distributed computing

K Krauter et al.

SOFTWARE-PRACTICE & EXPERIENCE (2002)