Modeling throughput sampling size for a cloud-hosted data scheduling and optimization service

FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE (2013)

Journal

FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE

Volume 29, Issue 7, Pages 1795-1807

Publisher

ELSEVIER

DOI: 10.1016/j.future.2013.01.003

Keywords

Distributed systems; Optimization; Network protocols; Distributed applications

Funding

National Science Foundation, Directorate for Computer and Information Science and Engineering, Division of Computing and Communication Foundations [1115805]
National Science Foundation, Directorate for Computer and Information Science and Engineering, Office of Advanced Cyberinfrastructure (OAC) [0926701]

Abstract

As big-data processing and analysis come to dominate the use of Cloud systems, the need for Cloud-hosted data scheduling and optimization services increases. A key component of such a service is the ability to estimate available bandwidth and achievable throughput, since all scheduling and optimization decisions build on this information. The biggest challenge in providing these estimates is deciding dynamically what proportion of the actual dataset, when transferred, yields an accurate estimate of the bandwidth and throughput achieved by transferring the whole dataset. That proportion of data is called the sampling size (or the probe size). Although small fixed sample sizes worked well for high-latency low-bandwidth networks in the past, high-bandwidth networks require much larger and more dynamic sample sizes, since an accurate estimate now also depends on how fast the transfer protocol can saturate the fat network link. In this study, we present a model that decides the optimal sampling size based on the data size and the estimated capacity of the network. Our results show that, in the majority of cases, the predicted sampling size is very close to the targeted best sampling size for a given file transfer. © 2013 Elsevier B.V. All rights reserved.
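
The abstract does not state the model itself, so the following is not the authors' method. It is only a minimal sketch of why a small fixed probe can underestimate throughput on a high-bandwidth link, assuming an idealized loss-free TCP slow start in which the congestion window doubles every round trip until it reaches the bandwidth-delay product. The function name `probe_throughput_bps`, the 1460-byte segment size, the initial window of 10 segments, and the 10 Gbps / 50 ms link are all illustrative assumptions.

```python
import math


def probe_throughput_bps(sample_bytes, capacity_bps, rtt_s, mss=1460, init_cwnd=10):
    """Average throughput (bits/s) observed while sending sample_bytes (> 0).

    Assumption: idealized slow start with no loss. Each round trip sends one
    window of segments, and the window doubles until it is capped at the
    bandwidth-delay product of the link.
    """
    # Number of segments that fill the link for one round trip.
    bdp_segments = max(1, math.ceil(capacity_bps * rtt_s / 8 / mss))
    remaining = math.ceil(sample_bytes / mss)
    cwnd = init_cwnd
    rounds = 0
    while remaining > 0:
        remaining -= min(cwnd, bdp_segments, remaining)
        cwnd = min(cwnd * 2, bdp_segments)
        rounds += 1
    return sample_bytes * 8 / (rounds * rtt_s)


if __name__ == "__main__":
    capacity = 10e9  # hypothetical 10 Gbps link
    rtt = 0.05       # hypothetical 50 ms round-trip time
    for size in (1e6, 100e6, 10e9):
        seen = probe_throughput_bps(size, capacity, rtt)
        print(f"{size / 1e6:>8.0f} MB probe: {seen / capacity:.1%} of link capacity")
```

Under these assumptions a probe of a few megabytes finishes while the window is still ramping up, so the throughput it reports is a small fraction of what a multi-gigabyte transfer would achieve on the same link. This is the effect that motivates choosing the sampling size from both the data size and the estimated network capacity.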
