Article

Learning Operators with Coupled Attention

Journal

JOURNAL OF MACHINE LEARNING RESEARCH
Volume 23, Issue -, Pages -

Publisher

MICROTOME PUBL

Keywords

deep learning; reproducing kernel Hilbert spaces; wavelet scattering networks; functional data analysis; universal approximation

Funding

  1. US Department of Energy under the Advanced Scientific Computing Research program [DE-SC0019116]
  2. US Air Force [AFOSR FA9550-20-1-0060]
  3. US Department of Energy/Advanced Research Projects Agency [DE-AR0001201]
  4. AFOSR [FA9550-19-1-0265]
  5. NSF Simmons Mathematical and Scientific Foundations of Deep Learning [2031985]

In this paper, a novel operator learning method called LOCA is proposed, which can approximate nonlinear operators even with a small number of output function measurements in the training set by coupling attention weights and integral transforms. Empirical results show that LOCA achieves state-of-the-art accuracy and robustness in various operator learning scenarios.
Supervised operator learning is an emerging machine learning paradigm with applications to modeling the evolution of spatio-temporal dynamical systems and approximating general black-box relationships between functional data. We propose a novel operator learning method, LOCA (Learning Operators with Coupled Attention), motivated by the recent success of the attention mechanism. In our architecture, the input functions are mapped to a finite set of features, which are then averaged with attention weights that depend on the output query locations. By coupling these attention weights together with an integral transform, LOCA is able to explicitly learn correlations in the target output functions, enabling us to approximate nonlinear operators even when the number of output function measurements in the training set is very small. Our formulation is accompanied by rigorous approximation-theoretic guarantees on the universal expressiveness of the proposed model. Empirically, we evaluate the performance of LOCA on several operator learning scenarios involving systems governed by ordinary and partial differential equations, as well as a black-box climate prediction problem. Through these scenarios we demonstrate state-of-the-art accuracy, robustness with respect to noisy input data, and a consistently small spread of errors over testing data sets, even for out-of-distribution prediction tasks.
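
The averaging step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the Gaussian kernel, the sinusoidal `score` function standing in for a learned score network, the one-dimensional query grid, and the placeholder feature values. It only shows the general idea: query-dependent scores are first smoothed by a kernel integral transform (approximated here by a normalized sum over grid points), then passed through a softmax to give attention weights, which average a finite set of input-function features.

```python
import math


def softmax(scores):
    """Numerically stable softmax; returns weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_kernel(y, z, length_scale=0.2):
    """Assumed coupling kernel; a learned kernel could be used instead."""
    return math.exp(-((y - z) ** 2) / (2.0 * length_scale ** 2))


def score(z, n_features):
    """Hypothetical stand-in for a learned score function g(z) in R^n."""
    return [math.sin((i + 1) * math.pi * z) for i in range(n_features)]


def coupled_attention_weights(y, grid, n_features):
    """Attention weights at query location y.

    The scores are coupled across query locations by a kernel-weighted
    average over `grid` (a simple quadrature approximation of an
    integral transform), then normalized with a softmax.
    """
    k = [gaussian_kernel(y, z) for z in grid]
    k_sum = sum(k)
    grid_scores = [score(z, n_features) for z in grid]
    coupled = [
        sum(k_j * s[i] for k_j, s in zip(k, grid_scores)) / k_sum
        for i in range(n_features)
    ]
    return softmax(coupled)


def predict(features, y, grid):
    """Output value at y: attention-weighted average of the features."""
    weights = coupled_attention_weights(y, grid, len(features))
    return sum(w * v for w, v in zip(weights, features))


# Placeholder features; in practice these would be computed from the
# input function by a learned encoder.
features = [0.5, -1.0, 2.0]
grid = [j / 20 for j in range(21)]  # query grid on [0, 1]

weights = coupled_attention_weights(0.3, grid, len(features))
prediction = predict(features, 0.3, grid)
```

Because the weights lie on the probability simplex, the prediction at any query location is a convex combination of the features, and the kernel smoothing makes the weights at nearby query locations vary together.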
