☆ 4.7 Article

APapo: An asynchronous parallel optimization method for DNN models

FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE (2024)

期刊

FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE

卷 152, 期 -, 页码 317-330

出版社

ELSEVIER

DOI: 10.1016/j.future.2023.11.004

关键词

DNN model parallelism; Model segmentation; Asynchronous pipeline parallelism; Augmented antichain; Computation-communication overlap

类别

Computer Science, Theory & Methods

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

智能总结 New
摘要

This paper proposes an asynchronous parallel optimization method APapo to address challenges in parallel optimization of large-scale DNN models. The method achieves fine-grained task segmentation, maximizes computing resource utilization, and improves training speed while maintaining accuracy.

To address the challenges related to segmentation complexity, high memory usage, extended training duration, and low equipment utilization in parallel optimization of large-scale deep neural network (DNN) models, this paper proposes an asynchronous parallel optimization method APapo. Firstly, a multi-iteration asynchronous pipeline parallel scheduling was established for model parallel computing tasks, controlling the specific scheduling process of micro-batch units to address gradient delay updating during asynchronous iteration. Secondly, combined with the given network model and hardware configuration, a dynamic programming strategy for computing resources and model tasks was designed to achieve dynamic segmentation of model computing tasks and optimal matching of computing resources. Finally, an optimization strategy for runtime scheduling of computing resources and model tasks was developed, using improved device streams to maximize the overlap between computing and communication, thus improving the utilization rate of computing resources and reducing training time. Experimental results show that the APapo method achieves fine-grained task segmentation, maximizes the utilization rate of each GPU computing resource, and on average improves the training speed of large-scale deep neural network models by 2.8 times while maintaining the training accuracy of the model compared to existing parallel optimization methods.

APapo: An asynchronous parallel optimization method for DNN models

期刊

FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE

出版社

ELSEVIER

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

APapo: An asynchronous parallel optimization method for DNN models

期刊

FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE

出版社

ELSEVIER

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文