Article

Pre-training the deep generative models with adaptive hyperparameter optimization

Journal

NEUROCOMPUTING
Volume 247, Pages 144-155

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2017.03.058

Keywords

Deep generative model; Hyperparameter optimization; Sequential model-based optimization; Contrastive divergence

Funding

  1. National Basic Research Program of China (973 Program) [2013CB336500]
  2. Chinese National 863 Program of Demonstration of Digital Medical Service and Technology in Destined Region [2012-AA02A614]
  3. National Youth Top-notch Talent Support Program

Abstract

The performance of many machine learning algorithms depends crucially on hyperparameter settings, especially in Deep Learning. Manually tuning hyperparameters is laborious and time-consuming. To address this issue, Bayesian optimization (BO) methods and their extensions have been proposed to optimize hyperparameters automatically. However, they still incur high computational expense when applied to deep generative models (DGMs), because they treat the entire model as a single black-box function. This paper provides a new hyperparameter optimization procedure for the pre-training phase of DGMs, in which we avoid combining all layers into one black-box function by taking advantage of the layer-by-layer learning strategy. Following this procedure, we are able to optimize multiple hyperparameters adaptively using Gaussian processes. In contrast to traditional BO methods, which mainly target supervised models, the pre-training procedure is unsupervised, so no validation error is available. To alleviate this problem, this paper proposes a new holdout loss, the free energy gap, which accounts for both model fitting and over-fitting. The empirical evaluations demonstrate that our method not only speeds up hyperparameter optimization, but also significantly improves the performance of DGMs in both supervised and unsupervised learning tasks. (C) 2017 Elsevier B.V. All rights reserved.
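The abstract names two ingredients without giving their exact form: an RBM-style free energy compared between training and held-out data, and GP-based optimization applied one layer at a time rather than to the whole model. The sketch below illustrates one plausible reading of that setup, assuming a binary-binary RBM and using scikit-optimize's gp_minimize as a stand-in GP optimizer. The sign and weighting of the gap follow the standard free-energy monitoring heuristic and may differ from the paper's exact definition; train_rbm_layer is a hypothetical contrastive-divergence trainer, not the authors' code.

```python
import numpy as np
from skopt import gp_minimize
from skopt.space import Real, Integer

def free_energy(v, W, vbias, hbias):
    """Free energy of a binary-binary RBM:
    F(v) = -v.a - sum_j log(1 + exp(b_j + (vW)_j))."""
    wx_b = v @ W + hbias                          # shape (n, n_hidden)
    # logaddexp(0, x) = log(1 + exp(x)), computed stably
    return -(v @ vbias) - np.logaddexp(0.0, wx_b).sum(axis=1)

def free_energy_gap(v_train, v_holdout, W, vbias, hbias):
    """Difference between the mean free energies of held-out and
    training data. A near-zero gap means the RBM fits held-out data
    about as well as training data; a growing gap signals over-fitting."""
    return (free_energy(v_holdout, W, vbias, hbias).mean()
            - free_energy(v_train, W, vbias, hbias).mean())

def tune_layer(v_train, v_holdout, train_rbm_layer, n_calls=30):
    """Tune one layer's hyperparameters with GP-based Bayesian
    optimization, scoring each trial by the free energy gap instead of
    a supervised validation error. `train_rbm_layer` is a hypothetical
    CD trainer returning (W, vbias, hbias)."""
    def objective(params):
        lr, n_hidden, cd_k = params
        W, vb, hb = train_rbm_layer(v_train, lr=lr,
                                    n_hidden=int(n_hidden),
                                    cd_k=int(cd_k))
        return free_energy_gap(v_train, v_holdout, W, vb, hb)

    space = [Real(1e-4, 1e-1, prior="log-uniform"),  # learning rate
             Integer(64, 1024),                      # hidden units
             Integer(1, 10)]                         # CD-k steps
    return gp_minimize(objective, space, n_calls=n_calls, random_state=0)
```

Because each layer is scored on its own free energy gap (using the activations produced by the layer below as input), the GP models a small per-layer search space on every call, rather than one search space that grows with network depth, which is where the speed-up over whole-model black-box BO would come from.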
