Reliability-Aware Online Scheduling for DNN Inference Tasks in Mobile-Edge Computing

IEEE INTERNET OF THINGS JOURNAL (2023)

Journal

IEEE INTERNET OF THINGS JOURNAL

Volume 10, Issue 13, Pages 11453-11464

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/JIOT.2023.3243266

Keywords

Task analysis; Reliability; Internet of Things; Scheduling; Reliability engineering; Processor scheduling; Energy consumption; Approximated submodular maximization; mobile-edge computing (MEC); online learning; reliability-aware scheduling

Categories

Computer Science, Information Systems; Engineering, Electrical & Electronic; Telecommunications

Smart Summary

Mobile-edge computing is a promising technique that gives resource-limited IoT devices access to AI capabilities. We propose a reliability-aware online scheduling scheme that uses online feedback and offline data to learn the uncertain availability of edge servers, with the goal of maximizing both the inference accuracy and the service reliability of DNN inference tasks.

Abstract

Mobile-edge computing (MEC) is widely envisioned as a promising technique for provisioning artificial intelligence (AI) capability for resource-limited Internet of Things (IoT) devices by leveraging nearby edge servers (ESs) to execute deep neural network (DNN) inference tasks. However, scheduling DNN inference tasks at the network edge under unknown system dynamics (e.g., uncertain availability of ESs) may suffer from failures, making it difficult to guarantee reliable services for the IoT device. To overcome this challenge, we propose a reliability-aware online scheduling scheme for DNN inference tasks in MEC that leverages both online feedback and offline data to learn the uncertain availability of ESs, with the goal of maximizing both the inference accuracy and the service reliability (i.e., the number of DNN inference tasks processed during the system span). We first formulate the reliability-aware DNN inference task scheduling problem as a novel constrained combinatorial multiarmed bandit (CMAB) problem. Then, by combining Lyapunov optimization, bandit learning, approximated submodular maximization, and historical data, we design a reliability-aware task scheduling with bandit learning (RTBL) algorithm to solve this problem. Even with an accurate prediction of the system uncertainties, the task scheduling problem remains NP-hard. We therefore design an approximation algorithm that exploits the submodularity of the scheduling problem to obtain a near-optimal solution with a performance guarantee. Finally, we conduct theoretical analysis and trace-driven simulations to demonstrate the performance of RTBL.
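
The abstract names the building blocks (a constrained CMAB formulation, Lyapunov optimization, bandit learning, and submodular maximization) but this page does not give the algorithm itself. The Python sketch below is only a toy illustration of how such pieces can fit together, and it is not the authors' RTBL. Everything in it is an assumption made for illustration: the function names, the UCB1-style confidence bonus, independent Bernoulli availability per ES, the "expected best accuracy among available ESs" objective, the cardinality budget, the weight V, and the reliability target. It also omits the offline/historical data the abstract mentions.

```python
import math
import random


def ucb(successes, pulls, t):
    """Optimistic availability estimate for one ES (UCB1-style bonus, an assumption)."""
    if pulls == 0:
        return 1.0  # unexplored servers are treated as fully available
    return min(1.0, successes / pulls + math.sqrt(1.5 * math.log(t) / pulls))


def expected_best(selected, avail, weight):
    """E[max weight among selected ESs that are up], assuming independent availability.

    This set function is monotone and submodular; with all weights equal to 1
    it reduces to P(at least one selected ES is up).
    """
    total, all_better_down = 0.0, 1.0
    for i in sorted(selected, key=lambda j: weight[j], reverse=True):
        total += weight[i] * avail[i] * all_better_down
        all_better_down *= 1.0 - avail[i]
    return total


def greedy_select(avail, accuracy, budget, V, Q):
    """Greedy maximization of V * accuracy term + Q * reliability term, at most `budget` ESs."""
    ones = [1.0] * len(avail)

    def score(s):
        return V * expected_best(s, avail, accuracy) + Q * expected_best(s, avail, ones)

    chosen = []
    for _ in range(budget):
        rest = [i for i in range(len(avail)) if i not in chosen]
        if not rest:
            break
        best = max(rest, key=lambda i: score(chosen + [i]))
        if score(chosen + [best]) <= score(chosen):
            break  # no remaining ES adds value
        chosen.append(best)
    return chosen


def simulate(true_avail, accuracy, budget=2, rounds=2000, V=5.0, target=0.9, seed=0):
    """Run the toy scheduler and return (fraction of tasks served, pulls per ES)."""
    rng = random.Random(seed)
    n = len(true_avail)
    pulls, succ = [0] * n, [0] * n
    Q, served = 0.0, 0  # Q is a Lyapunov-style virtual queue for the reliability target
    for t in range(1, rounds + 1):
        est = [ucb(succ[i], pulls[i], t) for i in range(n)]
        chosen = greedy_select(est, accuracy, budget, V, Q)
        up = [i for i in chosen if rng.random() < true_avail[i]]
        for i in chosen:  # semi-bandit feedback: availability of each chosen ES is observed
            pulls[i] += 1
            succ[i] += 1 if i in up else 0
        ok = 1 if up else 0
        served += ok
        Q = max(0.0, Q + target - ok)  # backlog grows whenever a task goes unserved
    return served / rounds, pulls


if __name__ == "__main__":
    rate, pulls = simulate([0.9, 0.2, 0.8], [0.7, 0.95, 0.9])
    print("served fraction:", rate, "pulls per ES:", pulls)
```

In this sketch the virtual queue Q grows whenever a task is not served, which shifts the greedy objective from accuracy toward reliability in later rounds. Because the per-round objective is a nonnegative combination of monotone submodular functions, greedy selection under a cardinality budget has the classical (1 - 1/e) guarantee with respect to the estimated availabilities; the guarantee derived in the paper may take a different form.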
