Article

Measuring data-centre workflows complexity through process mining: the Google cluster case

Journal

JOURNAL OF SUPERCOMPUTING
Volume 76, Issue 4, Pages 2449-2478

Publisher

SPRINGER
DOI: 10.1007/s11227-019-02996-2

Keywords

Cloud computing; Business process management; Scheduling; Process mining; Process discovery; High performance computing

Funding

  1. Ministry of Science and Technology of Spain, ECLIPSE project [RTI2018-094283-B-C33]
  2. Ministry of Science and Technology of Spain, REACT project [RTI2018-098062-A-I00]
  3. University of Seville, VI Plan Propio de Investigacion y Transferencia -US 2018, Proyecto Precompetitivo [2018/00000520]
  4. Catedra de Telefonica Inteligencia en la Red of the Universidad de Sevilla

Abstract

Data centres have become the backbone of large Cloud services and applications, providing virtually unlimited, elastic and scalable computational and storage resources. Optimising resource efficiency is a key concern for large Cloud Service Providers, and it is becoming more challenging as new computing paradigms such as the Internet of Things, Cyber-Physical Systems and Edge Computing spread. A key step towards efficiency in data centres is the discovery and proper analysis of data-centre behaviour. In this paper, we present a model that automatically retrieves execution workflows from existing data-centre logs by employing process mining techniques. The discovered processes are characterised and analysed according to their understandability and complexity in terms of the execution efficiency of data-centre jobs. Finally, we validate the proposal and demonstrate its usability by applying the model to a real scenario, namely the Google Cluster traces.
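To make the idea of discovering workflows from logs concrete, the following is a minimal sketch, not the authors' model. It assumes each log record reduces to a job identifier, a timestamp and an event type. It groups events into one trace per job, builds a directly-follows graph (a common first step in process discovery), and reports a simple size and density figure as a stand-in for a complexity measure. The sample records, the function names and the density formula are all illustrative assumptions. The event names follow the task event types of the public Google cluster trace.

```python
from collections import Counter, defaultdict

# Hypothetical sample records: (job_id, timestamp, event_type).
# These values are made up for illustration; they are not real trace data.
EVENTS = [
    ("job-1", 0, "SUBMIT"),
    ("job-1", 5, "SCHEDULE"),
    ("job-1", 9, "FINISH"),
    ("job-2", 1, "SUBMIT"),
    ("job-2", 4, "SCHEDULE"),
    ("job-2", 7, "EVICT"),
    ("job-2", 8, "SUBMIT"),
    ("job-2", 12, "SCHEDULE"),
    ("job-2", 20, "FINISH"),
]


def build_traces(events):
    """Group events by job and order them by timestamp (one trace per job)."""
    traces = defaultdict(list)
    for job_id, _ts, event_type in sorted(events, key=lambda e: (e[0], e[1])):
        traces[job_id].append(event_type)
    return dict(traces)


def directly_follows(traces):
    """Count how often activity a is immediately followed by activity b."""
    dfg = Counter()
    for trace in traces.values():
        for a, b in zip(trace, trace[1:]):
            dfg[(a, b)] += 1
    return dfg


def complexity(dfg):
    """Illustrative size metrics: activities, arcs, and arcs / activities^2."""
    activities = {activity for pair in dfg for activity in pair}
    n, arcs = len(activities), len(dfg)
    density = arcs / (n * n) if n else 0.0
    return {"activities": n, "arcs": arcs, "density": density}


traces = build_traces(EVENTS)
dfg = directly_follows(traces)
metrics = complexity(dfg)
```

In this sketch a denser graph suggests a workflow that is harder to read, for example because evicted jobs loop back to resubmission. Real process mining tools apply richer discovery algorithms and metrics than this count of arcs.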
