
When big data actually are low-rank, or entrywise approximation of certain function-generated matrices

Stanislav Budzinskiy
Abstract

The article concerns low-rank approximation of matrices generated by sampling a smooth function of two $m$-dimensional variables. We refute an argument made in the literature to prove that, for a specific class of analytic functions, such matrices admit accurate entrywise approximation of rank that is independent of $m$, a claim known as "big-data matrices are approximately low-rank". We provide a theoretical explanation of the numerical results presented in support of this claim, describing three narrower classes of functions for which $n \times n$ function-generated matrices can be approximated within an entrywise error of order $\varepsilon$ with rank $\mathcal{O}(\log(n) \varepsilon^{-2} \mathrm{polylog}(\varepsilon^{-1}))$ that is independent of the dimension $m$: (i) functions of the inner product of the two variables, (ii) functions of the Euclidean distance between the variables, and (iii) shift-invariant positive-definite kernels. We extend our argument to tensor-train approximation of tensors generated with functions of the multi-linear product of their $m$-dimensional variables. We discuss our results in the context of low-rank approximation of (a) growing datasets and (b) attention in transformer neural networks.
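
To make the setting concrete, the following is a minimal numerical sketch of class (i), a function of the inner product of the two variables. It is an illustration under stated assumptions, not the paper's construction: the choice $f(t) = e^{t}$, the sampling of points uniformly on the unit sphere, the sizes, and the helper names `sample_unit_vectors` and `entrywise_error_of_truncated_svd` are all hypothetical. The truncated SVD is used only as a convenient stand-in; it is optimal in the spectral and Frobenius norms, not in the entrywise (maximum) norm, so the errors it reports are an upper estimate of what an entrywise-optimal approximation of the same rank could achieve.

```python
import numpy as np


def sample_unit_vectors(n, m, rng):
    """Draw n points uniformly from the unit sphere in R^m (an assumed sampling scheme)."""
    x = rng.standard_normal((n, m))
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def entrywise_error_of_truncated_svd(A, rank):
    """Maximum absolute entry of A - A_r, where A_r is the rank-`rank` truncated SVD of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    A_r = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    return float(np.max(np.abs(A - A_r)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 200  # matrix size (assumed)
    for m in (5, 50, 500):  # dimensions of the variables (assumed)
        X = sample_unit_vectors(n, m, rng)
        Y = sample_unit_vectors(n, m, rng)
        # Function-generated matrix: A[i, j] = f(<x_i, y_j>) with f = exp (assumed).
        A = np.exp(X @ Y.T)
        errors = [entrywise_error_of_truncated_svd(A, r) for r in (1, 5, 10)]
        print(f"m = {m:4d}:", ", ".join(f"{e:.2e}" for e in errors))
```

Running the script prints, for each dimension $m$, the entrywise error at ranks 1, 5, and 10, which allows a rough comparison of how the achievable error at a fixed rank behaves as $m$ grows. It does not verify the rank bound stated in the abstract.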
