4.7 Article

Trading Cost and Throughput in Geo-Distributed Analytics With A Two Time Scale Approach

Journal

IEEE TRANSACTIONS ON CLOUD COMPUTING
Volume 10, Issue 3, Pages 2163-2177

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TCC.2020.2994195

Keywords

Geo-distributed data analytics; data placement; admission control; Lyapunov optimization; two-timescale approach

Funding

  1. NSFC General Technology Basic Research Joint Funds [U1836214]
  2. State Key Program of National Natural Science of China [61832013]
  3. Artificial Intelligence Science and Technology Major Project of Tianjin [18ZXZNGX00190]
  4. National Key R&D Program of China [2019QY1302, 2019YFB2102404]
  5. NSFC [61672379, 61872265, 61672131]
  6. NSFC-Guangdong Joint Funds [U1701263]
  7. Natural Science Foundation of Tianjin [18ZXZNGX00040]
  8. National Key R&D Program of China [2018YFB1004700]
  9. Science Innovation Foundation of Dalian [2019J12GX037]

This article focuses on the cost-throughput tradeoff problem in geo-distributed data analytics, aiming to minimize inter-DC traffic cost and maximize system throughput. By formulating a stochastic optimization problem and designing an online control framework, the proposed method achieves near-optimal solutions and maintains system stability and robustness.
In the era of global-scale services, analytical queries are performed on datasets that span multiple data centers (DCs). Such geo-distributed queries generate a large amount of inter-DC data transfers at run time. Because inter-DC bandwidth is expensive, various methods have been proposed to reduce the traffic cost in geo-distributed data analytics. However, current methods do not address the throughput issue in geo-distributed analytics. In this article, we aim to characterize and optimize a cost-throughput tradeoff problem in geo-distributed data analytics. Our objectives are two-fold: (1) we minimize the inter-DC traffic cost when serving geo-distributed analytics with uncertain query demand, and (2) we maximize the system throughput, in terms of the number of query requests that can be successfully served with guaranteed queuing delay. Specifically, we formulate a stochastic optimization problem that combines these two objectives. To solve this problem, we use Lyapunov optimization techniques to design and analyze a two-timescale online control framework. Without prior knowledge of future query requests, this framework makes online decisions on input data placement and admission control of query requests. Rigorous theoretical analyses show that our framework can achieve a near-optimal solution while maintaining system stability and robustness. Extensive trace-driven simulation results further demonstrate that our framework is capable of reducing inter-DC traffic cost, improving system throughput, and guaranteeing a maximum delay for each query request.
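
The abstract does not give the control rules themselves, so the following is only a minimal sketch of the generic textbook drift-plus-penalty idea behind Lyapunov-based admission control, not the article's framework. It assumes a single query queue, a fixed per-slot service rate, and a linear throughput reward weighted by a parameter `v`; the function names `admit` and `simulate` and all parameters are hypothetical. It omits the slower-timescale data placement decision entirely.

```python
def admit(queue_len, arrivals, v):
    """Per-slot admission decision (hypothetical helper).

    Minimizing the drift-plus-penalty term  queue_len * r - v * r
    over 0 <= r <= arrivals gives a threshold rule: admit everything
    while the backlog is below v, otherwise admit nothing.
    """
    return arrivals if queue_len < v else 0


def simulate(arrivals_per_slot, service_rate, v):
    """Run the threshold rule over a trace of per-slot arrivals.

    Returns (total admitted requests, largest backlog observed).
    """
    queue = 0
    admitted_total = 0
    max_queue = 0
    for arrivals in arrivals_per_slot:
        r = admit(queue, arrivals, v)
        # Standard queue update: serve first, then add admitted requests.
        queue = max(queue - service_rate, 0) + r
        admitted_total += r
        max_queue = max(max_queue, queue)
    return admitted_total, max_queue


if __name__ == "__main__":
    trace = [5, 5, 5, 5]  # made-up arrival trace, for illustration only
    print(simulate(trace, service_rate=2, v=6))
```

In this sketch a larger `v` favors throughput at the price of a longer backlog, and the backlog never exceeds `v` plus the largest single-slot arrival, which is the kind of bounded-queue guarantee that a maximum-delay claim typically rests on.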
