MULTS: A multi-cloud fault-tolerant architecture to manage transient servers in cloud computing

JOURNAL OF SYSTEMS ARCHITECTURE (2019)

Journal

JOURNAL OF SYSTEMS ARCHITECTURE

Volume 101

Publisher

ELSEVIER

DOI: 10.1016/j.sysarc.2019.101651

Keywords

Cloud computing; Fault tolerance; Checkpoint; Machine learning; Resilient architecture; Spot instance; Survival analysis

Funding

Brazilian Coordination for the Improvement of Higher Education Personnel (CAPES) [1441250]
Brazilian National Council for Scientific and Technological Development (CNPq) [311301/2018-5]

Abstract

The large-scale use of cloud computing resources has made the reliability of cloud environments an important issue. In addition, cloud providers now offer their unused resources as transient servers: lower-priced virtual machines whose resources can be revoked without user intervention, which makes them unreliable. To increase the availability of transient servers, we propose a multi-cloud fault-tolerant architecture that provides a resilient environment, using a scenario-based optimal checkpoint scheme to guarantee that processes keep running at reduced user cost. The architecture combines a heuristic that extracts information from case-based reasoning with a statistical model that predicts failure events, and together they help refine the fault tolerance parameters. The result is a cloud environment with better reliability and reduced execution time. Extensive simulations show high accuracy, with a survival prediction success rate of up to 92% and an execution time reduction of 74.58% for long-running applications. The results are promising, indicating that the proposed architecture can prevent revocation failures under realistic working conditions.
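The abstract does not specify the statistical model, so the following is only a minimal sketch of one standard survival-analysis technique, the Kaplan-Meier estimator, applied to transient-server lifetimes. It is not the authors' method. The function names (`kaplan_meier`, `survival_at`) and the sample lifetimes are hypothetical and chosen purely for illustration.

```python
from typing import List, Tuple


def kaplan_meier(durations: List[float], revoked: List[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) steps of the Kaplan-Meier estimator.

    durations: observed lifetime of each instance, in hours.
    revoked:   True if the instance was revoked at that time; False if the
               observation ended while it was still running (censored).
    """
    # Distinct times at which at least one revocation was observed.
    event_times = sorted({t for t, r in zip(durations, revoked) if r})
    survival = 1.0
    curve = []
    for t in event_times:
        # Instances still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Revocations that happened exactly at time t.
        events = sum(1 for d, r in zip(durations, revoked) if r and d == t)
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve


def survival_at(curve: List[Tuple[float, float]], t: float) -> float:
    """Step-function lookup: estimated probability of surviving beyond time t."""
    prob = 1.0
    for time, s in curve:
        if time > t:
            break
        prob = s
    return prob


# Hypothetical lifetimes (hours); these are NOT data from the paper.
durations = [2.0, 3.0, 3.0, 5.0, 8.0, 8.0]
revoked = [True, True, False, True, False, True]
curve = kaplan_meier(durations, revoked)
```

A scheduler could call `survival_at(curve, t)` for a planned run length `t` and checkpoint more often when the estimated survival probability is low. How MULTS actually maps predictions to checkpoint parameters is not described in the abstract.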
