
Efficient High-Accuracy PDE Solver with the Linear Attention Neural Operator

Main: 27 pages
8 figures
Bibliography: 4 pages
6 tables
Abstract

Neural operators offer a powerful data-driven framework for learning mappings between function spaces, yet transformer-based neural operator architectures face a fundamental scalability–accuracy trade-off: softmax attention provides excellent fidelity but incurs quadratic complexity $\mathcal{O}(N^2 d)$ in the number of mesh points $N$ and hidden dimension $d$, while linear attention variants reduce the cost to $\mathcal{O}(N d^2)$ but often suffer significant accuracy degradation. To address this challenge, we present the Linear Attention Neural Operator (LANO), a neural operator that achieves both scalability and high accuracy by reformulating attention through an agent-based mechanism. LANO introduces a compact set of $M$ agent tokens ($M \ll N$) that mediate global interactions among the $N$ tokens. This agent attention mechanism yields an operator layer with linear complexity $\mathcal{O}(MNd)$ while preserving the expressive power of softmax attention. Theoretically, we establish the universal approximation property of LANO and demonstrate its improved conditioning and stability. Empirically, LANO surpasses current state-of-the-art neural PDE solvers, including Transolver with slice-based softmax attention, achieving an average accuracy improvement of $19.5\%$ across standard benchmarks. By bridging the gap between linear complexity and softmax-level performance, LANO establishes a scalable, high-accuracy foundation for scientific machine learning applications.
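To make the complexity claim concrete, the sketch below illustrates one common way an agent-based attention layer can be written: the $M$ agent tokens first aggregate information from all $N$ mesh tokens, and the tokens then query the agents, so each softmax is over at most $M$ or $N$ entries and the total cost is $\mathcal{O}(MNd)$. This is a minimal, hypothetical illustration of the general agent-attention idea, not the paper's actual LANO layer; the function and variable names are assumptions.

```python
import torch

def agent_attention(x, agents, wq, wk, wv):
    """Hypothetical sketch of agent-based linear attention.

    x:      (N, d) token features on the mesh
    agents: (M, d) learnable agent tokens, M << N
    wq, wk, wv: (d, d) projection matrices
    """
    q, k, v = x @ wq, x @ wk, x @ wv                     # (N, d) each
    scale = x.shape[-1] ** 0.5

    # Step 1: agents aggregate global context from all N tokens
    # softmax over N -> cost O(M * N * d)
    agg = torch.softmax(agents @ k.T / scale, dim=-1) @ v  # (M, d)

    # Step 2: tokens query the M agents to receive that context
    # softmax over M -> cost O(N * M * d)
    out = torch.softmax(q @ agents.T / scale, dim=-1) @ agg  # (N, d)
    return out

# toy usage: N = 4096 mesh points, d = 64 channels, M = 32 agents
N, d, M = 4096, 64, 32
x = torch.randn(N, d)
agents = torch.nn.Parameter(torch.randn(M, d))
wq, wk, wv = (torch.randn(d, d) for _ in range(3))
y = agent_attention(x, agents, wq, wk, wv)  # (N, d), overall O(M * N * d)
```

Because $M$ is a fixed small constant, the memory and compute of such a layer grow linearly in $N$, in contrast to the $N \times N$ attention matrix of standard softmax attention.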
