Adversarially Training MCMC with Non-Volume-Preserving Flows

Journal

ENTROPY
Volume 24, Issue 3

Publisher

MDPI
DOI: 10.3390/e24030415

Keywords

Hamiltonian Monte Carlo; flow models; Markov chain Monte Carlo; statistical pattern recognition; Bayesian machine learning

Funding

  1. NSFC [62076096, 62006078]
  2. Shanghai Municipal Project [20511100900]
  3. Shanghai Knowledge Service Platform Project [ZF1213]
  4. Shanghai Chenguang Program [19CG25]
  5. Open Research Fund of KLATASDS-MOE
  6. Fundamental Research Funds for the Central Universities

Flow models parameterized by neural networks have recently been used to design efficient Markov chain Monte Carlo (MCMC) transition kernels. However, inefficient use of gradient information or reliance on volume-preserving flows restricts their performance when sampling from multi-modal target distributions. This paper proposes a novel training scheme that divides the training of transition kernels into an exploration stage and a training stage, making full use of gradient information and the expressive power of deep neural networks. The proposed method significantly improves effective sample size, mixes quickly to the target distribution, and outperforms other state-of-the-art parameterized transition kernels on various challenging distributions and real-world datasets.
Recently, flow models parameterized by neural networks have been used to design efficient Markov chain Monte Carlo (MCMC) transition kernels. However, inefficient use of the target distribution's gradient information, or the use of volume-preserving flows, limits their performance in sampling from multi-modal target distributions. In this paper, we treat the training procedure of the parameterized transition kernels differently and propose a novel scheme for training MCMC transition kernels. We divide the training process into an exploration stage and a training stage, which makes full use of the gradient information of the target distribution and the expressive power of deep neural networks. The transition kernels are constructed with non-volume-preserving flows and trained adversarially. The proposed method significantly improves effective sample size and mixes quickly to the target distribution. Empirical results show that it achieves low sample autocorrelation and fast convergence, and that it outperforms other state-of-the-art parameterized transition kernels on a variety of challenging analytically described distributions and real-world datasets.
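
To make the term "non-volume-preserving flow" concrete, the following is a minimal sketch of one common construction, an affine coupling layer, and not the architecture used in the paper. The functions `scale_fn` and `shift_fn` are hypothetical stand-ins for neural networks, and the input is assumed to have even dimension. The point it illustrates is that the map is exactly invertible while its log-determinant is generally non-zero, so volume is not preserved.

```python
import math

def scale_fn(x_a):
    # Hypothetical stand-in for a neural network that outputs log-scales.
    return [0.5 * math.tanh(v) for v in x_a]

def shift_fn(x_a):
    # Hypothetical stand-in for a neural network that outputs shifts.
    return [0.1 * v for v in x_a]

def coupling_forward(x):
    """Affine coupling: keep the first half, transform the second half.

    Returns (y, log_det_jacobian). Assumes len(x) is even.
    """
    assert len(x) % 2 == 0, "this sketch assumes an even dimension"
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    s, t = scale_fn(x_a), shift_fn(x_a)
    y_b = [xb * math.exp(si) + ti for xb, si, ti in zip(x_b, s, t)]
    # The Jacobian is triangular, so its log-determinant is the sum of log-scales.
    return x_a + y_b, sum(s)

def coupling_inverse(y):
    """Exact inverse of coupling_forward. Returns (x, log_det_jacobian)."""
    assert len(y) % 2 == 0, "this sketch assumes an even dimension"
    half = len(y) // 2
    y_a, y_b = y[:half], y[half:]
    s, t = scale_fn(y_a), shift_fn(y_a)
    x_b = [(yb - ti) * math.exp(-si) for yb, si, ti in zip(y_b, s, t)]
    return y_a + x_b, -sum(s)

x = [1.0, -2.0, 0.5, 3.0]
y, log_det = coupling_forward(x)
x_back, log_det_inv = coupling_inverse(y)
```

A volume-preserving flow would force `log_det` to zero for every input; here it depends on the input through `scale_fn`, which is the extra flexibility the abstract refers to.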
