4.7 Article

Learning Operators with Coupled Attention

Journal

Publisher

MICROTOME PUBL

Keywords

deep learning; reproducing kernel Hilbert spaces; wavelet scattering networks; functional data analysis; universal approximation

Funding

  1. US Department of Energy under the Advanced Scientific Computing Research program [DE-SC0019116]
  2. US Air Force [AFOSR FA9550-20-1-0060]
  3. US Department of Energy/Advanced Research Projects Agency [DE-AR0001201]
  4. AFOSR [FA9550-19-1-0265]
  5. NSF Simmons Mathematical and Scientific Foundations of Deep Learning [2031985]


In this paper, a novel operator learning method called LOCA is proposed, which can approximate nonlinear operators even with a small number of output function measurements in the training set by coupling attention weights and integral transforms. Empirical results show that LOCA achieves state-of-the-art accuracy and robustness in various operator learning scenarios.
Supervised operator learning is an emerging machine learning paradigm with applications to modeling the evolution of spatio-temporal dynamical systems and approximating general black-box relationships between functional data. We propose a novel operator learning method, LOCA (Learning Operators with Coupled Attention), motivated by the recent success of the attention mechanism. In our architecture, the input functions are mapped to a finite set of features, which are then averaged with attention weights that depend on the output query locations. By coupling these attention weights together with an integral transform, LOCA is able to explicitly learn correlations in the target output functions, enabling us to approximate nonlinear operators even when the number of output function measurements in the training set is very small. Our formulation is accompanied by rigorous approximation-theoretic guarantees on the universal expressiveness of the proposed model. Empirically, we evaluate the performance of LOCA on several operator learning scenarios involving systems governed by ordinary and partial differential equations, as well as a black-box climate prediction problem. Through these scenarios we demonstrate state-of-the-art accuracy, robustness with respect to noisy input data, and a consistently small spread of errors over testing data sets, even for out-of-distribution prediction tasks.
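The mechanism described in the abstract can be illustrated with a small sketch. The pure-Python example below is not the authors' implementation: the feature map `features`, the score function `score`, the Gaussian kernel `rbf_kernel`, and the quadrature points are hypothetical stand-ins for components that would be learned networks in practice. It only shows the general shape of the idea, assuming that the "integral transform" is approximated by a normalized kernel-weighted sum and that the attention weights come from a softmax.

```python
import math


def softmax(scores):
    """Turn a list of scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rbf_kernel(y, z, length_scale=0.5):
    """Gaussian kernel (an assumed choice) used to couple query locations."""
    return math.exp(-((y - z) ** 2) / (2.0 * length_scale ** 2))


def score(y, n_features):
    """Hypothetical stand-in for a learned score network g(y)."""
    return [math.sin((i + 1) * y) for i in range(n_features)]


def coupled_scores(y, quadrature_points, n_features):
    """Discrete integral transform: kernel-weighted average of g over z."""
    weights = [rbf_kernel(y, z) for z in quadrature_points]
    norm = sum(weights)
    raw = [score(z, n_features) for z in quadrature_points]
    return [
        sum(w * r[i] for w, r in zip(weights, raw)) / norm
        for i in range(n_features)
    ]


def features(u_samples, n_features):
    """Hypothetical feature map: cosine-weighted averages of the input samples."""
    m = len(u_samples)
    return [
        sum(u * math.cos(i * math.pi * j / max(m - 1, 1))
            for j, u in enumerate(u_samples)) / m
        for i in range(n_features)
    ]


def predict(u_samples, y, quadrature_points, n_features=4):
    """Average the input features with attention weights that depend on y."""
    v = features(u_samples, n_features)
    phi = softmax(coupled_scores(y, quadrature_points, n_features))
    return sum(p * vi for p, vi in zip(phi, v)), phi


# Usage: an input function sampled at 16 points, queried at y = 0.3.
u = [math.sin(2 * math.pi * j / 15) for j in range(16)]
z = [k / 9 for k in range(10)]
value, attention = predict(u, 0.3, z)
```

Because the attention weights are a softmax output, the prediction at each query location is a convex combination of the input features. The kernel step makes the weights at nearby query locations depend on shared scores, which is one simple way to express the coupling the abstract refers to.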

