Article

Dynamic Pricing and Placing for Distributed Machine Learning Jobs: An Online Learning Approach

Journal

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS
Volume 41, Issue 4, Pages 1135-1150

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/JSAC.2023.3242707

Keywords

Pricing; Runtime; Cloud computing; Servers; Heuristic algorithms; Dynamic scheduling; Costs; Machine learning; dynamic pricing; online placement


Nowadays, distributed machine learning (ML) jobs usually adopt a parameter server (PS) framework to train models over large-scale datasets. Such an ML job deploys hundreds of concurrent workers, and model parameter updates are exchanged frequently between the workers and PSs. In current practice, workers and PSs may be placed on different physical servers, introducing uncertainty into a job's runtime. Existing cloud pricing policies often charge a fixed price according to the job's runtime. Although this strategy is simple to implement, it is not suitable for distributed ML jobs, whose runtime is stochastic and can only be estimated from the job's placement after admission. To supplement existing cloud pricing schemes, we design a dynamic pricing and placement algorithm, DPS, for distributed ML jobs. DPS aims to maximize the cloud service provider's profit: it dynamically calculates a unit resource price upon a job's arrival and, if the offered price is accepted by the user, determines the job's placement so as to minimize its runtime. Our design exploits the multi-armed bandit (MAB) technique to learn unknown information from past sales. DPS balances the exploration and exploitation stages and selects the best price based on a reward related to job runtime. Compared to benchmark pricing schemes, our learning-based algorithm increases the provider's profit by 200% and achieves regret that is sub-linear in both the time horizon and the total number of jobs. Extensive evaluations using real-world data also validate the efficacy of DPS.
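The abstract does not spell out the authors' DPS algorithm, but the exploration/exploitation idea it describes can be illustrated with a standard UCB1 bandit over a discrete set of candidate unit prices. This is a minimal, hypothetical sketch, not the paper's method: the candidate price grid and the `reward_fn` (a normalized profit signal, zero when the user rejects the quoted price) are assumptions for illustration.

```python
import math

def ucb1_price_selection(prices, reward_fn, rounds):
    """Pick a unit resource price via UCB1.

    prices    -- list of candidate unit prices (the bandit arms)
    reward_fn -- reward_fn(price) returns a reward in [0, 1], e.g.
                 normalized profit (0 if the job rejects the price)
    rounds    -- total number of job arrivals to simulate
    """
    n = len(prices)
    counts = [0] * n      # times each price was offered
    totals = [0.0] * n    # cumulative reward per price

    # Exploration bootstrap: offer each candidate price once.
    for i, p in enumerate(prices):
        totals[i] += reward_fn(p)
        counts[i] = 1

    # Exploitation with an exploration bonus (upper confidence bound).
    for t in range(n + 1, rounds + 1):
        ucb = [totals[i] / counts[i] + math.sqrt(2 * math.log(t) / counts[i])
               for i in range(n)]
        i = max(range(n), key=lambda k: ucb[k])
        totals[i] += reward_fn(prices[i])
        counts[i] += 1

    # Report the price with the best empirical mean reward.
    best = max(range(n), key=lambda k: totals[k] / counts[k])
    return prices[best]
```

For example, if acceptance probability falls linearly in price so that expected profit is `p * (1 - p)`, the learner should converge toward the revenue-maximizing price of 0.5 among the offered candidates. The paper's DPS additionally ties the reward to job runtime via placement, which this sketch omits.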

