
Kolmogorov-Arnold Networks: Approximation and Learning Guarantees for Functions and their Derivatives

Abstract

Inspired by the Kolmogorov-Arnold superposition theorem, Kolmogorov-Arnold Networks (KANs) have recently emerged as an improved backbone for most deep learning frameworks, promising more adaptivity than their multilayer perceptron (MLP) predecessor by allowing for trainable spline-based activation functions. In this paper, we probe the theoretical foundations of the KAN architecture by showing that it can approximate any Besov function in $B^{s}_{p,q}(\mathcal{X})$ on a bounded open, or even fractal, domain $\mathcal{X}$ in $\mathbb{R}^d$ at the optimal approximation rate with respect to any weaker Besov norm $B^{\alpha}_{p,q}(\mathcal{X})$, where $\alpha < s$. We complement our approximation guarantee with a dimension-free estimate on the sample complexity of a residual KAN model when learning a function of Besov regularity from $N$ i.i.d. noiseless samples. Our KAN architecture incorporates contemporary deep learning wisdom by leveraging residual/skip connections between layers.
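
To make the architecture described above concrete, the following is a minimal, hypothetical PyTorch sketch of a single KAN layer: each edge (i, j) carries a trainable B-spline function, the layer output sums these edge functions over the inputs, and a linear skip path supplies the residual connection mentioned in the abstract. This is an illustration of the general idea only, not the authors' implementation; the names (KANLayer, bspline_basis) and the knot-grid and degree choices are assumptions.

import torch
import torch.nn as nn


def bspline_basis(x, grid, degree):
    # Cox-de Boor recursion.
    # x    : (batch, in_dim) inputs
    # grid : (in_dim, n_knots) knot positions per input coordinate
    # returns (batch, in_dim, n_knots - degree - 1) basis values
    x = x.unsqueeze(-1)                       # (batch, in_dim, 1)
    g = grid.unsqueeze(0)                     # (1, in_dim, n_knots)
    # degree-0 (piecewise-constant) basis
    B = ((x >= g[..., :-1]) & (x < g[..., 1:])).to(x.dtype)
    for k in range(1, degree + 1):
        left = (x - g[..., : -(k + 1)]) / (g[..., k:-1] - g[..., : -(k + 1)])
        right = (g[..., k + 1:] - x) / (g[..., k + 1:] - g[..., 1:-k])
        B = left * B[..., :-1] + right * B[..., 1:]
    return B


class KANLayer(nn.Module):
    # One KAN layer: y_j = sum_i phi_ij(x_i) with trainable spline phi_ij
    # on every edge, plus a linear skip/residual path between layers.

    def __init__(self, in_dim, out_dim, n_knots=12, degree=3, lo=-1.0, hi=1.0):
        super().__init__()
        self.degree = degree
        # uniform knot grid, extended by `degree` knots on each side so the
        # splines cover [lo, hi]
        h = (hi - lo) / (n_knots - 1)
        knots = torch.linspace(lo - degree * h, hi + degree * h,
                               n_knots + 2 * degree)
        self.register_buffer("grid", knots.repeat(in_dim, 1))
        n_basis = knots.numel() - degree - 1
        # spline coefficients: one set per (input, output) edge
        self.coef = nn.Parameter(0.1 * torch.randn(in_dim, out_dim, n_basis))
        # linear skip connection, as in residual KAN variants
        self.skip = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, x):
        B = bspline_basis(x, self.grid, self.degree)        # (batch, in, n_basis)
        spline = torch.einsum("bin,ion->bo", B, self.coef)  # sum over inputs and basis
        return spline + self.skip(x)


if __name__ == "__main__":
    net = nn.Sequential(KANLayer(3, 16), KANLayer(16, 1))
    x = torch.rand(8, 3) * 2 - 1          # inputs in [-1, 1]^3
    print(net(x).shape)                    # torch.Size([8, 1])

In this sketch the approximation target f is represented by compositions of such layers; the spline coefficients play the role of the trainable activation functions, and the number of knots controls the resolution at which Besov-regular functions can be resolved.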

@article{kratsios2025_2504.15110,
  title={Kolmogorov-Arnold Networks: Approximation and Learning Guarantees for Functions and their Derivatives},
  author={Anastasis Kratsios and Takashi Furuya},
  journal={arXiv preprint arXiv:2504.15110},
  year={2025}
}