Superposition, the ability of neural networks to represent more features than they have neurons, is increasingly seen as key to the efficiency of large models. This paper investigates the theoretical foundations of computing in superposition, establishing complexity bounds for explicit, provably correct algorithms.

We present the first lower bounds for a neural network computing in superposition, showing that for a broad class of problems, including permutations and pairwise logical operations, computing $m'$ features in superposition requires at least $\Omega(\sqrt{m' \log m'})$ neurons and $\Omega(m' \log m')$ parameters. This implies the first subexponential upper bound on superposition capacity: a network with $n$ neurons can compute at most $O(n^2 / \log n)$ features. Conversely, we provide a nearly tight constructive upper bound: logical operations like pairwise AND can be computed using $O(\sqrt{m'} \log m')$ neurons and $O(m' \log^2 m')$ parameters. There is thus an exponential gap between the complexity of computing in superposition (the subject of this work) and that of merely representing features, which can require as few as $O(\log m')$ neurons based on the Johnson-Lindenstrauss Lemma.

Our hope is that our results open a path for using complexity-theoretic techniques in neural network interpretability research.
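To make the capacity statement concrete, here is a brief sketch (not taken from the paper) of how a bound of the form $m' = O(n^2/\log n)$ would follow from a neuron lower bound of the form $\Omega(\sqrt{m' \log m'})$; the constant $c$ below stands for the constant hidden in that lower bound.

```latex
% Sketch: inverting a neuron lower bound of the form n >= c * sqrt(m' log m')
% to read off the capacity bound m' = O(n^2 / log n).
\[
  n \;\ge\; c\,\sqrt{m' \log m'}
  \quad\Longrightarrow\quad
  m' \log m' \;\le\; \frac{n^2}{c^2}.
\]
% If m' >= n, then log m' >= log n, hence m' <= n^2 / (c^2 log n);
% if m' < n, then trivially m' < n <= n^2 / log n for n >= 2.
% In either case, m' = O(n^2 / log n).
```

The contrast with mere representation can also be seen empirically. Below is a minimal, illustrative NumPy sketch (not code from the paper; the constant 40 and the number of sampled pairs are arbitrary choices) showing that, in the spirit of the Johnson-Lindenstrauss Lemma, $m'$ random feature directions packed into only $n = O(\log m')$ dimensions retain small pairwise interference, which is what representing features in superposition requires.

```python
# Illustrative sketch (not from the paper): representing m' features in
# n = O(log m') dimensions via random directions, in the spirit of the
# Johnson-Lindenstrauss Lemma. The constant 40 and the pair-sampling size
# are arbitrary illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
m_prime = 10_000                      # number of features to represent
n = int(40 * np.log(m_prime))         # number of neurons, O(log m')

# Assign each feature a random unit direction in R^n.
features = rng.standard_normal((m_prime, n))
features /= np.linalg.norm(features, axis=1, keepdims=True)

# Estimate worst-case pairwise interference |<f_i, f_j>| over a random sample of pairs.
pairs = rng.integers(0, m_prime, size=(50_000, 2))
pairs = pairs[pairs[:, 0] != pairs[:, 1]]          # drop accidental self-pairs
dots = np.abs(np.einsum("ij,ij->i", features[pairs[:, 0]], features[pairs[:, 1]]))
print(f"n = {n} neurons for m' = {m_prime} features; "
      f"max sampled |<f_i, f_j>| = {dots.max():.3f}")
```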
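With $n = O(\log m')$ the interference stays well below 1, so features remain linearly decodable in a space exponentially smaller than $m'$; the paper's point is that actually computing new features (e.g., pairwise ANDs) in such a compressed space cannot be done this cheaply.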
@article{adler2025_2409.15318,
  title   = {On the Complexity of Neural Computation in Superposition},
  author  = {Micah Adler and Nir Shavit},
  journal = {arXiv preprint arXiv:2409.15318},
  year    = {2025}
}