Article

Efficient modeling of trivializing maps for lattice φ4 theory using normalizing flows: A first look at scalability

Journal

PHYSICAL REVIEW D
Volume 104, Issue 9, Article 094507

Publisher

AMER PHYSICAL SOC
DOI: 10.1103/PhysRevD.104.094507

Funding

  1. Science and Technology Facilities Council (STFC) Consolidated Grant [ST/P0000630/1]
  2. Royal Society Wolfson Research Merit Award [WM140078]
  3. STFC [ST/R504737/1]
  4. Edinburgh Compute and Data Facility (ECDF) [113]

Abstract

General-purpose Markov chain Monte Carlo sampling algorithms suffer a severe loss of efficiency near critical points, but normalizing flows, a class of machine-learned field transformations, offer a promising way around this problem. By learning approximate trivializing maps, i.e., transformations onto a theory in which the degrees of freedom decouple, the approach yields a sampling strategy whose statistical efficiency is decoupled from the correlation length of the system. Further work is needed to understand and mitigate the rapid growth of training costs as the continuum limit is approached.
General-purpose Markov chain Monte Carlo sampling algorithms suffer from a dramatic reduction in efficiency as the system being studied is driven toward a critical point through, for example, taking the continuum limit. Recently, a series of seminal studies suggested that normalizing flows, a class of deep generative models, can form the basis of a sampling strategy that does not suffer from this critical slowing down. The central idea is to use machine learning techniques to build (approximate) trivializing maps, i.e., field transformations that map the theory of interest into a simpler theory in which the degrees of freedom decouple. These trivializing maps provide a representation of the theory in which all its nontrivial aspects are encoded within an invertible transformation to a set of field variables whose statistical weight in the path integral is given by a distribution from which sampling is easy. No separate process is required to generate training data for such models, and convergence to the desired distribution is guaranteed through a reweighting procedure such as a Metropolis test. From a theoretical perspective, this approach has the potential to become more efficient than traditional sampling, since the statistical efficiency of the sampling algorithm is decoupled from the correlation length of the system. The caveat to all of this is that, somehow, the costs associated with the highly nontrivial task of sampling from the path integral of an interacting field theory are transferred to the training of a model to perform this transformation. In a proof-of-principle demonstration on two-dimensional φ4 theory, Albergo, Kanwar, and Shanahan [Phys. Rev. D 100, 034515 (2019)] modeled the trivializing map as a sequence of pointwise affine transformations. We pick up this thread, with the aim of quantifying how well we can expect this approach to scale as we increase the number of degrees of freedom in the system. We make several modifications to the original design that allow our models to learn more efficient representations of trivializing maps using much smaller neural networks, which leads to a large reduction in the computational cost required to train models of equivalent quality. After making these changes, we find that sampling efficiency is almost entirely dictated by how extensively a model has been trained, while being unresponsive to further alterations that increase model flexibility. However, as we move toward the continuum limit the training costs scale extremely quickly, which urgently requires further work to fully understand and mitigate.
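
To make the construction concrete: the pointwise affine transformations mentioned above can be sketched as a coupling layer acting on one half of a checkerboard-partitioned lattice. The sketch below is illustrative only, not the authors' implementation; the class name, network size, and partitioning convention are assumptions.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """One pointwise affine coupling layer: the 'active' half of the
    lattice is rescaled and shifted by functions of the 'frozen' half,
    so the Jacobian is triangular and log|det J| is a simple sum."""

    def __init__(self, n_sites: int, hidden: int = 16):
        super().__init__()
        # A deliberately small network, echoing the paper's point that
        # compact models can represent the map efficiently.
        self.net = nn.Sequential(
            nn.Linear(n_sites // 2, hidden),
            nn.Tanh(),
            nn.Linear(hidden, n_sites),  # outputs (log-scale s, shift t)
        )

    def forward(self, frozen: torch.Tensor, active: torch.Tensor):
        s, t = self.net(frozen).chunk(2, dim=-1)
        transformed = active * torch.exp(s) + t  # pointwise affine map
        log_det = s.sum(dim=-1)                  # layer's log|det J|
        return transformed, log_det
```

A full flow would stack several such layers, alternating which half of the lattice is active. The Metropolis test mentioned in the abstract can be sketched in the same spirit: proposals are drawn independently from the trained flow q and accepted with probability min(1, w(y)/w(x)), where w = p/q is the importance weight. Here `flow.sample()` and `action` are hypothetical stand-ins, not names from the paper.

```python
def metropolis_step(state, log_w_state, flow, action):
    """One independence-sampler update using log importance weights
    log w = log p - log q; rejected steps repeat the previous state."""
    proposal, log_q = flow.sample()  # assumed to return (y, log q(y))
    log_w_proposal = -action(proposal) - log_q  # log p(y) - log q(y), up to a constant
    if torch.log(torch.rand(())) < log_w_proposal - log_w_state:
        return proposal, log_w_proposal  # accept
    return state, log_w_state            # reject
```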

Authors

Luigi Del Debbio, Joe Marsh Rossney, Michael Wilson
