Article

Homogeneous Vector Capsules Enable Adaptive Gradient Descent in Convolutional Neural Networks

Journal

IEEE ACCESS
Volume 9, Pages 48519-48530

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2021.3066842

Keywords

Routing; Mathematical model; Computer architecture; Training; Convolutional neural networks; Neurons; Adaptive systems; Adaptive gradient descent; capsule; convolutional neural network (CNN); homogeneous vector capsules (HVCs); Inception

The article introduces homogeneous vector capsules (HVCs), a new way of parameterizing and training capsules, and reports that modifying a convolutional neural network to use HVCs can improve classification accuracy. HVCs also enable the use of adaptive gradient descent, reducing how much a model's achievable accuracy depends on the finely tuned hyperparameters of a non-adaptive optimizer.

Neural networks traditionally produce a scalar value for an activated neuron. Capsules, on the other hand, produce a vector of values, which has been shown to correspond to a single, composite feature wherein the values of the components of the vectors indicate properties of the feature such as transformation or contrast. We present a new way of parameterizing and training capsules that we refer to as homogeneous vector capsules (HVCs). We demonstrate, experimentally, that altering a convolutional neural network (CNN) to use HVCs can achieve superior classification accuracy without increasing the number of parameters or operations in its architecture as compared to a CNN using a single final fully connected layer. Additionally, the introduction of HVCs enables the use of adaptive gradient descent, reducing the dependence a model's achievable accuracy has on the finely tuned hyperparameters of a non-adaptive optimizer. We demonstrate our method and results using two neural network architectures. For the CNN architecture referred to as Inception v3, replacing the fully connected layers with HVCs increased the test accuracy by an average of 1.32% across all experiments conducted. For a simple monolithic CNN, we show HVCs improve test accuracy by an average of 19.16%.
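
The abstract does not spell out how HVCs are parameterized, so the following is only a rough illustration of the idea of a capsule layer that matches the parameter count of a fully connected layer. It assumes (this is an assumption, not something stated above) that each input capsule is combined with each output capsule through an element-wise product with a weight vector of the same dimension, and that a class score is the sum of an output capsule's components. The function names `hadamard_capsule_layer` and `capsule_logits` and the toy numbers are hypothetical.

```python
def hadamard_capsule_layer(capsules, weights):
    """Combine N input capsules into C output capsules of the same dimension D.

    capsules: N lists of length D.
    weights:  N x C x D nested lists (one weight vector per input/output pair).
    ASSUMPTION: element-wise weighting; not confirmed by the abstract.
    """
    n_out = len(weights[0])
    dim = len(capsules[0])
    out = [[0.0] * dim for _ in range(n_out)]
    for n, capsule in enumerate(capsules):
        for c in range(n_out):
            for d in range(dim):
                # Element-wise product, accumulated over all input capsules.
                out[c][d] += weights[n][c][d] * capsule[d]
    return out


def capsule_logits(class_capsules):
    """Reduce each output capsule to one scalar score by summing its components."""
    return [sum(vector) for vector in class_capsules]


# Toy example: N = 2 input capsules, D = 3 components, C = 2 classes.
capsules = [[1, 2, 3], [4, 5, 6]]
weights = [
    [[1, 1, 1], [0, 1, 0]],  # weight vectors applied to input capsule 0
    [[1, 1, 1], [0, 1, 0]],  # weight vectors applied to input capsule 1
]

class_capsules = hadamard_capsule_layer(capsules, weights)
logits = capsule_logits(class_capsules)

# A fully connected layer over the same N*D flattened features and C outputs
# would use N*D*C weights, which equals the number of weights used here.
fc_params = len(capsules) * len(capsules[0]) * len(weights[0])
hvc_params = sum(len(vec) for per_input in weights for vec in per_input)
```

Under these assumptions the two layers hold the same number of weights, which is consistent with the abstract's claim that accuracy improves without adding parameters. The sketch says nothing about the reported accuracy gains themselves.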
