4.7 Article

Edge AI as a Service: Configurable Model Deployment and Delay-Energy Optimization With Result Quality Constraints

Journal

IEEE TRANSACTIONS ON CLOUD COMPUTING
Volume 11, Issue 2, Pages 1954-1969

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TCC.2022.3175725

Keywords

AI as a Service; edge computing; delay-energy optimization; resource allocation; task configuration

The breakthrough of AI techniques has accelerated their applications in various industries, including security protection, transportation, agriculture, and medical care. With the support of edge computing environments, providing AIaaS with latency guarantee can speed up the deployment of data-intensive and computation-intensive AI applications and reduce customers' investment cost. However, existing studies have not addressed the specific deployment architecture, working mechanism design, and performance optimization problems for AIaaS with configurable data quality and model complexity. To tackle this, we propose a configurable model deployment architecture (CMDA) for edge AIaaS and a flexible working mechanism that allows joint configuration of data quality ratios (DQRs) and model complexity ratios (MCRs) for AI tasks.
The breakthrough of artificial intelligence (AI) techniques has accelerated their applications in a wide range of industries, such as security protection, transportation, agriculture, and medical care. With the support of edge computing environments, providing latency guaranteed AI as a Service (AIaaS) can accelerate the deployment of data-intensive and computation-intensive AI applications and reduce the investment cost of the customers. However, the deployment architecture and working mechanism design, and performance optimization problems specific for AIaaS with configurable data quality and model complexity have not been studied in existing works. To address the problem, we propose a configurable model deployment architecture (CMDA) for edge AIaaS and present a flexible working mechanism by enabling the joint configuration of data quality ratios (DQRs) and model complexity ratios (MCRs) for the AI tasks. Along with commonly used resource allocation operations, the manager can improve the energy and delay performance of AI services with the desired quality of results (QoRs). We develop an energy-delay minimization problem under the framework of CMDA and propose a polynomial regression based relaxing method to solve the task configuration subproblem. We conduct experiments and simulations on the ImageNet classification and the common objects in context (COCO) object detection tasks using state-of-the-art deep learning models. We present the corresponding result quality tables (RQTs) and QoR regression models to illustrate the proposed method. The results of single task configuration and multi-task configuration and resource allocation on ImageNet classification and COCO object detection tasks demonstrate that the proposed method can achieve over 5x HDEC improvement compared with non-optimization schemes, and also show that joint configuration of DQR and MCR can achieve over 1.2x HDEC improvement compared with the methods that only configure DQR or MCR.
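The abstract describes fitting a regression model of QoR over the configurable DQR and MCR and then choosing a configuration that meets a desired QoR. The following is a minimal sketch of that general idea only, not the paper's formulation. Everything in it is an assumption made for illustration: the degree-2 polynomial form, the synthetic result quality table, the `cost_fn` stand-in for a delay-energy cost, and the grid search used in place of the paper's relaxing method.

```python
import numpy as np

def poly_features(dqr, mcr):
    """Degree-2 polynomial features in (DQR, MCR); the degree is an assumption."""
    dqr = np.asarray(dqr, dtype=float)
    mcr = np.asarray(mcr, dtype=float)
    return np.stack(
        [np.ones_like(dqr), dqr, mcr, dqr * mcr, dqr**2, mcr**2], axis=-1
    )

def fit_qor_model(dqr, mcr, qor):
    """Least-squares fit of QoR as a polynomial in (DQR, MCR)."""
    coef, *_ = np.linalg.lstsq(
        poly_features(dqr, mcr), np.asarray(qor, dtype=float), rcond=None
    )
    return coef

def predict_qor(coef, dqr, mcr):
    """Evaluate the fitted QoR model at one or many configurations."""
    return poly_features(dqr, mcr) @ coef

def best_config(coef, qor_min, cost_fn, grid=91):
    """Grid search for the cheapest (DQR, MCR) whose predicted QoR >= qor_min.

    Returns None when no grid point satisfies the constraint.
    """
    ratios = np.linspace(0.1, 1.0, grid)
    d, m = np.meshgrid(ratios, ratios, indexing="ij")
    feasible = predict_qor(coef, d, m) >= qor_min
    if not feasible.any():
        return None
    cost = np.where(feasible, cost_fn(d, m), np.inf)
    i, j = np.unravel_index(np.argmin(cost), cost.shape)
    return float(d[i, j]), float(m[i, j])

# Synthetic result quality table (made-up numbers, NOT from the paper).
true_coef = np.array([0.2, 0.5, 0.4, -0.1, -0.2, -0.1])
levels = np.array([0.25, 0.5, 0.75, 1.0])
rqt_dqr, rqt_mcr = (a.ravel() for a in np.meshgrid(levels, levels, indexing="ij"))
rqt_qor = poly_features(rqt_dqr, rqt_mcr) @ true_coef

def cost_fn(dqr, mcr):
    """Hypothetical cost proxy: work grows with data size times model size."""
    return dqr * mcr

coef = fit_qor_model(rqt_dqr, rqt_mcr, rqt_qor)
config = best_config(coef, qor_min=0.6, cost_fn=cost_fn)
print("chosen (DQR, MCR):", config)
```

With a real result quality table, `rqt_dqr`, `rqt_mcr`, and `rqt_qor` would be replaced by measured values, and `cost_fn` by whatever delay-energy model applies to the deployment.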
