
Bound by semanticity: universal laws governing the generalization-identification tradeoff

Main: 9 pages
15 figures
Bibliography: 3 pages
Appendix: 16 pages
Abstract

Intelligent systems must deploy internal representations that are simultaneously structured -- to support broad generalization -- and selective -- to preserve input identity. We expose a fundamental limit on this tradeoff. For any model whose representational similarity between inputs decays with finite semantic resolution $\varepsilon$, we derive closed-form expressions that pin its probability of correct generalization $p_S$ and identification $p_I$ to a universal Pareto front independent of input-space geometry. Extending the analysis to noisy, heterogeneous spaces and to $n > 2$ inputs predicts a sharp $1/n$ collapse of multi-input processing capacity and a non-monotonic optimum for $p_S$. A minimal ReLU network trained end-to-end reproduces these laws: during learning, a resolution boundary self-organizes, and empirical $(p_S, p_I)$ trajectories closely follow the theoretical curves for linearly decaying similarity. Finally, we demonstrate that the same limits persist in two markedly more complex settings -- a convolutional neural network and state-of-the-art vision-language models -- confirming that finite-resolution similarity is a fundamental emergent informational constraint, not merely a toy-model artifact. Together, these results provide an exact theory of the generalization-identification tradeoff and clarify how semantic resolution shapes the representational capacity of deep networks and brains alike.
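As a rough illustration of the tradeoff the abstract describes (a sketch, not the paper's derivation), the snippet below Monte Carlo-estimates $(p_S, p_I)$ for a model whose pairwise similarity decays linearly with input distance, $s(d) = \max(0, 1 - d/\varepsilon)$. The kernel form, the uniform distance distribution, and the class threshold `d_same` are assumptions made here for illustration; sweeping $\varepsilon$ traces an empirical tradeoff curve, with larger $\varepsilon$ raising $p_S$ at the cost of $p_I$.

```python
# Illustrative Monte Carlo sketch of the generalization-identification
# tradeoff under a linearly decaying similarity kernel. All modeling
# choices below (kernel, distance distribution, class threshold) are
# assumptions for this sketch, not the paper's closed-form theory.
import numpy as np

rng = np.random.default_rng(0)

def ps_pi(eps, d_same=0.1, n_pairs=100_000):
    """Estimate generalization p_S and identification p_I at resolution eps."""
    d = rng.uniform(0.0, 1.0, n_pairs)      # pairwise input distances (assumed uniform)
    s = np.clip(1.0 - d / eps, 0.0, 1.0)    # linearly decaying similarity s(d)
    merged = rng.uniform(size=n_pairs) < s  # model treats the pair as "the same"
    same = d < d_same                       # ground truth: same semantic class
    p_s = merged[same].mean()               # generalize: merge within-class pairs
    p_i = (~merged[~same]).mean()           # identify: separate across-class pairs
    return p_s, p_i

# Sweeping eps traces the tradeoff curve: coarser resolution (larger eps)
# improves generalization p_S while degrading identification p_I.
for eps in (0.05, 0.1, 0.2, 0.5, 1.0):
    p_s, p_i = ps_pi(eps)
    print(f"eps={eps:4.2f}  p_S={p_s:.3f}  p_I={p_i:.3f}")
```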
