Article

Self-Growing Binary Activation Network: A Novel Deep Learning Model With Dynamic Architecture

Publisher

IEEE (Institute of Electrical and Electronics Engineers, Inc.)
DOI: 10.1109/TNNLS.2022.3176027

Keywords

Training; Computer architecture; Neurons; Task analysis; Deep learning; Mathematical models; Data models; Binary activation function; function-preserving transformation; incremental learning; network compression; neural architecture search (NAS)

Funding

  1. Key Project of National Key Research and Development Project [2017YFC1703303]
  2. National Natural Science Foundation of China [62076211]


This paper proposes a new deep learning model, the self-growing binary activation network (SGBAN), which achieves higher performance with a more compact architecture on a given task. The model is constructed by progressively extending a fully connected network, which makes the process more efficient than traditional neural architecture search methods. Experimental results on several classification tasks demonstrate the effectiveness of SGBAN, showing optimization ability equivalent to standard training together with notable gains in accuracy and parameter efficiency.
For a deep learning model, the network architecture is crucial: a model with an inappropriate architecture often suffers from performance degradation or parameter redundancy. However, finding an appropriate architecture for a given application is largely empirical and difficult. To tackle this problem, we propose a novel deep learning model with a dynamic architecture, named the self-growing binary activation network (SGBAN), which extends the design of a fully connected network (FCN) progressively, resulting in a more compact architecture with higher performance on a given task. This construction process is more efficient than neural architecture search methods, which train a large number of networks to search for the optimal one. Concretely, the training technique of SGBAN is based on function-preserving transformations that can expand the architecture and incorporate the information in new data without discarding the knowledge learned in previous steps. Experimental results on four classification tasks, i.e., Iris, MNIST, CIFAR-10, and CIFAR-100, demonstrate the effectiveness of SGBAN. On the one hand, SGBAN achieves competitive accuracy compared with an FCN of the same architecture, which indicates that the new training technique has optimization ability equivalent to traditional optimization methods. On the other hand, on MNIST, the architecture generated by SGBAN achieves a 0.59% improvement in accuracy with only 33.44% of the parameters compared with FCNs using manually designed architectures, i.e., 500 + 150 hidden units. Furthermore, we demonstrate that replacing the fully connected layers of a well-trained VGG-19 with SGBAN yields slightly improved performance with less than 1% of the parameters on all these tasks. Finally, we show that the proposed method can handle incremental learning tasks and outperforms three established incremental learning methods, i.e., learning without forgetting, elastic weight consolidation, and gradient episodic memory, on the incremental learning tasks on both Disjoint MNIST and Disjoint CIFAR-10.
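As a rough illustration of the "function-preserving transformation" idea the abstract refers to, the following minimal NumPy sketch is our own assumption, not the authors' SGBAN procedure; names such as binary_step and widen_hidden_layer are purely illustrative. It builds a tiny fully connected network with a hard binary activation and then adds a hidden unit whose outgoing weights are zero, so the network's outputs are unchanged until later training assigns the new unit a role.

# Minimal sketch (assumed, not the authors' code): binary activation plus a
# function-preserving widening step for a two-layer fully connected network.
import numpy as np

def binary_step(z):
    """Hard binary activation: 1 where the pre-activation is positive, else 0."""
    return (z > 0).astype(z.dtype)

def forward(x, W1, b1, W2, b2):
    """Two-layer fully connected network with a binary hidden activation."""
    h = binary_step(x @ W1 + b1)
    return h @ W2 + b2

def widen_hidden_layer(W1, b1, W2, rng):
    """Add one hidden unit in a function-preserving way.

    The new unit's incoming weights are free (here random), while its outgoing
    weights are zero, so the network's outputs are unchanged by the expansion.
    """
    new_in = rng.normal(scale=0.1, size=(W1.shape[0], 1))   # new incoming column
    new_b = np.zeros(1)
    new_out = np.zeros((1, W2.shape[1]))                     # zero outgoing row
    W1 = np.concatenate([W1, new_in], axis=1)
    b1 = np.concatenate([b1, new_b])
    W2 = np.concatenate([W2, new_out], axis=0)
    return W1, b1, W2

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                                  # a small input batch
W1, b1 = rng.normal(size=(8, 5)), rng.normal(size=5)
W2, b2 = rng.normal(size=(5, 3)), rng.normal(size=3)

y_before = forward(x, W1, b1, W2, b2)
W1, b1, W2 = widen_hidden_layer(W1, b1, W2, rng)
y_after = forward(x, W1, b1, W2, b2)
assert np.allclose(y_before, y_after)                        # function preserved

The check holds regardless of the activation, since a unit with zero outgoing weights cannot affect the output; the paper's actual growth criterion and the training rule that exploits the binary activation are described in the full text.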

