Article

Applications of provenance in performance prediction and data storage optimisation

Publisher

ELSEVIER
DOI: 10.1016/j.future.2017.01.003

Keywords

E-science; Workflow; Provenance; Machine learning

Accurate and comprehensive storage of provenance information is a basic requirement for modern scientific computing. Significant effort in recent years has produced robust theories and standards for representing these traces across a variety of execution platforms. Whilst these are necessary to enable repeatability, they do not exploit the captured information to its full potential. This data is increasingly being captured from applications hosted on cloud computing platforms, which offer large-scale computing resources without significant up-front costs. Medical applications, which generate large datasets, are also suited to cloud computing, as the practicalities of storing and processing such data locally are becoming increasingly challenging. This paper shows how provenance can be captured from medical applications, stored using a graph database, and then used to answer audit questions and enable repeatability. This static provenance will then be combined with performance data to predict future workloads, inform decision makers and reduce latency. Finally, cost models based on real-world cloud computing costs will be used to determine optimum strategies for data retention over potentially extended periods of time. (C) 2017 Elsevier B.V. All rights reserved.
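
The audit use case mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract says only that provenance is stored in a graph database, so a plain in-memory dictionary stands in for that store here. The node names (`report.pdf`, `scan_001.dcm`, and so on) and the functions `lineage` and `source_inputs` are hypothetical, chosen only to show how a "what did this result depend on?" question reduces to a graph traversal.

```python
from collections import deque

# Hypothetical provenance edges (an assumption for illustration):
# each item maps to the items it was directly derived from.
DERIVED_FROM = {
    "report.pdf": ["segmentation_run"],
    "segmentation_run": ["scan_001.dcm", "model_v2"],
    "model_v2": ["training_run"],
    "training_run": ["training_set"],
    "scan_001.dcm": [],
    "training_set": [],
}


def lineage(node, edges):
    """Return every ancestor of `node`, found by breadth-first traversal."""
    seen = set()
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for parent in edges.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def source_inputs(node, edges):
    """Return the ancestors of `node` that have no recorded parents."""
    return {n for n in lineage(node, edges) if not edges.get(n)}


if __name__ == "__main__":
    # Audit question: which original inputs contributed to the report?
    print(sorted(source_inputs("report.pdf", DERIVED_FROM)))
```

In a real deployment the same question would be expressed in the query language of whichever graph database holds the traces; the traversal logic is the part this sketch aims to show.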
