Article

Adversarially Training MCMC with Non-Volume-Preserving Flows

Journal

ENTROPY
Volume 24, Issue 3, Pages: -

Publisher

MDPI
DOI: 10.3390/e24030415

Keywords

Hamiltonian Monte Carlo; flow models; Markov chain Monte Carlo; statistical pattern recognition; Bayesian machine learning

Funding

  1. NSFC [62076096, 62006078]
  2. Shanghai Municipal Project [20511100900]
  3. Shanghai Knowledge Service Platform Project [ZF1213]
  4. Shanghai Chenguang Program [19CG25]
  5. Open Research Fund of KLATASDS-MOE
  6. Fundamental Research Funds for the Central Universities

Abstract

Recently, flow models parameterized by neural networks have been used to design efficient Markov chain Monte Carlo (MCMC) transition kernels. However, inefficient use of the target distribution's gradient information, or the use of volume-preserving flows, limits their performance when sampling from multi-modal target distributions. In this paper, we approach the training of parameterized transition kernels differently and propose a novel training scheme. We divide the training process into an exploration stage and a training stage, which makes full use of the gradient information of the target distribution and the expressive power of deep neural networks. The transition kernels are constructed with non-volume-preserving flows and trained adversarially. The proposed method achieves a significant improvement in effective sample size and mixes quickly to the target distribution. Empirical results validate that the proposed method achieves low sample autocorrelation and fast convergence, and outperforms other state-of-the-art parameterized transition kernels on a variety of challenging, analytically described distributions and real-world datasets.
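The abstract's core idea, using a non-volume-preserving flow as an MCMC proposal and correcting for the volume change in the Metropolis-Hastings acceptance ratio, can be illustrated with a simplified sketch. This is not the paper's adversarially trained kernel: it uses a flow-based independence sampler on a toy 2-D Gaussian target, with tiny linear maps (`w`, `b`) standing in for the neural scale/shift networks of a RealNVP-style coupling layer.

```python
# Hedged sketch (not the paper's exact method): flow-based independence
# Metropolis-Hastings. A RealNVP-style affine coupling layer maps Gaussian
# noise to proposals; since the map does not preserve volume, the proposal
# density must include the log-Jacobian of the flow.
import numpy as np

rng = np.random.default_rng(0)
w, b = 0.1, 0.1  # hypothetical stand-ins for learned scale/shift networks

def log_target(x):
    # Standard 2-D Gaussian as a toy target (log-density up to a constant).
    return -0.5 * np.sum(x ** 2)

def flow_forward(z):
    # Affine coupling: keep z[0]; scale and shift z[1] as a function of z[0].
    s = np.tanh(w * z[0])            # bounded log-scale
    t = b * z[0]                     # shift
    y = np.array([z[0], z[1] * np.exp(s) + t])
    return y, s                      # s = log |det Jacobian|

def flow_inverse(y):
    # Exact inverse of the coupling layer (first coordinate is unchanged).
    s = np.tanh(w * y[0])
    t = b * y[0]
    return np.array([y[0], (y[1] - t) * np.exp(-s)]), s

def log_proposal(y):
    # Change of variables: q(y) = N(z; 0, I) * |det df^{-1}/dy|, z = f^{-1}(y).
    z, s = flow_inverse(y)
    return -0.5 * np.sum(z ** 2) - s

def mh_step(x):
    # Independence proposal: push Gaussian noise through the flow, then
    # accept with the standard MH ratio p(y)q(x) / (p(x)q(y)).
    y, _ = flow_forward(rng.standard_normal(2))
    log_alpha = (log_target(y) - log_proposal(y)) - (log_target(x) - log_proposal(x))
    if np.log(rng.uniform()) < log_alpha:
        return y, 1.0
    return x, 0.0

x = np.zeros(2)
samples, accepts = [], []
for _ in range(2000):
    x, a = mh_step(x)
    samples.append(x)
    accepts.append(a)
samples = np.asarray(samples)
accepts = np.asarray(accepts)
```

In the paper, the flow parameters are trained (adversarially, after an exploration stage) so that the kernel mixes well on multi-modal targets; here they are fixed near the identity, so the sampler behaves like a nearly exact independence sampler on this toy target.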

