Article

Local minima found in the subparameter space can be effective for ensembles of deep convolutional neural networks

Journal

PATTERN RECOGNITION
Volume 109

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.patcog.2020.107582

Keywords

Ensemble learning; Ensemble selection; Ensemble fusion; Deep convolutional neural network

Funding

  1. Sichuan Science and Technology Program [2020YFS0088]
  2. 1.3.5 project for disciplines of excellence Clinical Research Incubation Project, West China Hospital, Sichuan University [2019HXFH036]
  3. National Key Research and Development Program, China [2017YFC0113908]
  4. Technological Innovation Project of Chengdu New Industrial Technology Research Institute [2017-CY02-00026-GX]
  5. West China Hospital, Sichuan University [139170022]

Abstract

Ensembles of deep CNNs play a crucial role in ensemble learning for artificial intelligence applications, but the increasing complexity of deep CNN architectures and large data dimensionality have made their usage costly. A new approach is proposed to find multiple models converging to local minima in the subparameter space of deep CNNs, which can improve generalization while being more affordable during training and testing stages.
Ensembles of deep convolutional neural networks (CNNs), which integrate multiple deep CNN models to achieve better generalization for an artificial intelligence application, now play an important role in ensemble learning due to the dominant position of deep learning. However, the usage of ensembles of deep CNNs remains limited because the increasing complexity of deep CNN architectures and the emergence of high-dimensional data make both the training and testing stages of such ensembles expensive. To alleviate this situation, we propose a new approach that finds multiple models converging to local minima in the subparameter space for ensembles of deep CNNs. The subparameter space here refers to the space constructed by a partial selection of parameters, rather than the entire set of parameters, of a deep CNN architecture. We show that local minima found in the subparameter space of a deep CNN architecture can in fact be effective for ensembles of deep CNNs to achieve better generalization. Moreover, finding local minima in the subparameter space of a deep CNN architecture is more affordable at the training stage, and the multiple models at the found local minima can also be selectively fused to achieve better ensemble generalization while limiting the testing-stage expense to that of a single deep CNN model. Demonstrations of MobilenetV2, Resnet50 and InceptionV4 (deep CNN architectures from lightweight to complex) on ImageNet, CIFAR-10 and CIFAR-100, respectively, lead us to believe that finding local minima in the subparameter space of a deep CNN architecture could be leveraged to broaden the usage of ensembles of deep CNNs. (C) 2020 Elsevier Ltd. All rights reserved.
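The core idea — optimizing only a selected subset of a model's parameters to obtain multiple diverse minima, then fusing the resulting models so that test-time cost stays at a single model — can be illustrated with a minimal NumPy sketch. This is not the paper's method: the toy logistic model, the random parameter masks, and fusion by simple parameter averaging are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary-classification data (stand-in for a real image dataset).
X = rng.normal(size=(200, 10))
true_w = rng.normal(size=10)
y = (X @ true_w > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_in_subspace(w0, mask, lr=0.1, steps=300):
    """Gradient descent that updates only the parameters selected by `mask`,
    i.e. searches for a minimum in a subparameter space; the remaining
    parameters stay frozen at their initial values."""
    w = w0.copy()
    for _ in range(steps):
        p = sigmoid(X @ w)
        grad = X.T @ (p - y) / len(y)
        w -= lr * grad * mask  # zero out updates outside the subspace
    return w

w0 = 0.1 * rng.normal(size=10)  # shared initialization

# Find several members, each optimized in a different random subspace
# (~half of the parameters selected per member).
members = []
for _ in range(5):
    mask = (rng.random(10) < 0.5).astype(float)
    members.append(train_in_subspace(w0, mask))

# Fuse the members by averaging parameters, so inference costs the same
# as a single model (one forward pass).
w_fused = np.mean(members, axis=0)
acc = float(((sigmoid(X @ w_fused) > 0.5) == y).mean())
print(f"fused training accuracy: {acc:.2f}")
```

In a deep-learning framework the same masking step would typically be done by disabling gradients for the frozen parameters; here the mask multiplication plays that role for the sake of a self-contained example.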

