Article

Improved training of deep convolutional networks via minimum-variance regularized adaptive sampling

Journal

SOFT COMPUTING
Volume 27, Issue 18, Pages 13237-13253

Publisher

SPRINGER
DOI: 10.1007/s00500-022-07131-7

Keywords

Deep learning; Convolutional neural networks; Gradient descent; Importance sampling

An adaptive sampling method based on importance sampling (IS) is introduced to improve the training of deep neural networks (DNNs). Experimental results show that the method improves both training speed and gradient variance without significantly affecting classification accuracy.
Fostered by technological and theoretical developments, deep neural networks (DNNs) have achieved great success in many applications, but training them via mini-batch stochastic gradient descent (SGD) can be very costly, owing to the tens of millions of parameters to be optimized and the large number of training examples that must be processed. The computational cost is exacerbated by the inefficiency of the uniform sampling typically used by SGD to form the training mini-batches: since not all training examples are equally relevant for training, sampling them under a uniform distribution is far from optimal, making the case for improved methods to train DNNs. A better strategy is to sample the training instances under a distribution in which the probability of being selected is proportional to the relevance of each individual instance; one way to achieve this is through importance sampling (IS), which minimizes the variance of the gradients with respect to the network parameters and consequently improves convergence. In this paper, an IS-based adaptive sampling method to improve the training of DNNs is introduced. The method, dubbed regularized adaptive sampling (RAS), exploits side information to construct the optimal sampling distribution. Experimental comparison using deep convolutional networks for classification on the MNIST and CIFAR-10 datasets shows that, compared against SGD and against another state-of-the-art sampling method, RAS improves the speed and variance of the training process without incurring significant overhead or affecting classification accuracy.
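
The mechanism the abstract describes (drawing each mini-batch example with probability proportional to a per-example relevance score, then re-weighting its loss by 1/(N*p_i) so the gradient estimate stays unbiased) can be illustrated with a short sketch. The snippet below is a minimal, generic importance-sampling training loop in PyTorch, assuming loss-proportional scores as a proxy for per-example gradient norm; the class and function names (ImportanceSampler, train_step) are illustrative, and this is not the paper's RAS algorithm, which constructs its sampling distribution using side information.

# Minimal sketch of importance-sampled mini-batch SGD in PyTorch.
# Illustrative only: probabilities are proportional to each example's
# last-seen loss, a common proxy for gradient norm; this is not the
# paper's RAS method.
import torch
import torch.nn as nn

class ImportanceSampler:
    def __init__(self, n_examples, smoothing=1e-3):
        # Start uniform; `smoothing` keeps every probability positive.
        self.scores = torch.ones(n_examples)
        self.smoothing = smoothing

    def probabilities(self):
        p = self.scores + self.smoothing
        return p / p.sum()

    def sample(self, batch_size):
        p = self.probabilities()
        idx = torch.multinomial(p, batch_size, replacement=True)
        # Unbiasedness correction: weight example i by 1 / (N * p_i).
        weights = 1.0 / (len(p) * p[idx])
        return idx, weights

    def update(self, idx, losses):
        # Refresh scores with the freshly observed per-example losses.
        self.scores[idx] = losses.detach()

def train_step(model, optimizer, X, y, sampler, batch_size=64):
    idx, weights = sampler.sample(batch_size)
    logits = model(X[idx])
    losses = nn.functional.cross_entropy(logits, y[idx], reduction="none")
    # The weighted mean keeps the gradient an unbiased estimate of the
    # full-dataset gradient despite the non-uniform sampling.
    loss = (weights * losses).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    sampler.update(idx, losses)
    return loss.item()

The additive smoothing term keeps every sampling probability strictly positive, which bounds the correction weights and prevents low-scoring examples from being starved; pulling the distribution toward uniform in this way serves a stabilizing purpose that adaptive-sampling schemes in general need to address.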
