4.7 Article

On Expressivity and Trainability of Quadratic Networks

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TNNLS.2023.3331380

Keywords

Expressivity; neuronal diversity; quadratic networks; quadratic neurons; training strategy

Inspired by the diversity of biological neurons, this paper explores the use of quadratic artificial neurons in deep learning and examines how quadratic networks differ in expressivity and training risk from conventional networks with or without quadratic activation. Using spline theory and a measure from algebraic geometry, it shows mathematically that quadratic networks are more expressive than these conventional counterparts, and it proposes a training strategy called ReLinear to stabilize the training of quadratic networks.
Inspired by the diversity of biological neurons, quadratic artificial neurons can play an important role in deep learning models. The type of quadratic neuron of interest here replaces the inner-product operation in the conventional neuron with a quadratic function. Despite the promising results achieved so far by networks of quadratic neurons, important issues remain not well addressed. Theoretically, the superior expressivity of a quadratic network over either a conventional network or a conventional network with quadratic activation is not fully elucidated, which leaves the use of quadratic networks not well grounded. In practice, although a quadratic network can be trained via generic backpropagation, it can be subject to a higher risk of collapse than its conventional counterpart. To address these issues, we first apply spline theory and a measure from algebraic geometry to give two theorems that demonstrate better model expressivity of a quadratic network than the conventional counterpart with or without quadratic activation. Then, we propose an effective training strategy referred to as referenced linear initialization (ReLinear) to stabilize the training process of a quadratic network, thereby unleashing its full potential in the associated machine learning tasks. Comprehensive experiments on popular datasets are performed to support our findings and confirm the performance of quadratic deep learning.
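
To make the idea of replacing the inner product with a quadratic function concrete, here is a minimal sketch of a single quadratic neuron in plain Python. The abstract does not spell out the exact quadratic form or the details of ReLinear, so everything specific below is an assumption: the parameterization (a product of two affine terms plus a weighted sum of squared inputs), the parameter names `w_r`, `w_g`, `w_b`, `b_r`, `b_g`, `c`, the ReLU activation, and the initialization, which is only one plausible reading of "referenced linear initialization" in which the quadratic terms start out so that the neuron behaves exactly like a conventional one.

```python
import random


def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


class QuadraticNeuron:
    """Hypothetical quadratic neuron (assumed form, not taken from the abstract):

        z = (w_r . x + b_r) * (w_g . x + b_g) + w_b . (x * x) + c
    """

    def __init__(self, n_in, seed=0):
        rng = random.Random(seed)
        scale = (2.0 / n_in) ** 0.5
        # The "linear" branch is initialized randomly, like a conventional neuron.
        self.w_r = [rng.gauss(0.0, scale) for _ in range(n_in)]
        self.b_r = 0.0
        # Assumed linear-referenced start: the second factor is the constant 1
        # and the squared-input term is 0, so z reduces to w_r . x + b_r.
        self.w_g = [0.0] * n_in
        self.b_g = 1.0
        self.w_b = [0.0] * n_in
        self.c = 0.0

    def pre_activation(self, x):
        squared = [xi * xi for xi in x]
        product = (dot(self.w_r, x) + self.b_r) * (dot(self.w_g, x) + self.b_g)
        return product + dot(self.w_b, squared) + self.c

    def forward(self, x):
        # ReLU is an illustrative choice of activation.
        return max(self.pre_activation(x), 0.0)


neuron = QuadraticNeuron(n_in=3)
x = [0.5, -1.0, 2.0]
linear_part = dot(neuron.w_r, x) + neuron.b_r
# At this initialization the quadratic neuron matches the conventional one.
print(abs(neuron.pre_activation(x) - linear_part) < 1e-12)
```

Under this assumed setup, training would begin from the behavior of a conventional network and let the quadratic parameters move away from their starting values only gradually, which is one way to picture how such an initialization could reduce the risk of collapse.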
