Bound by semanticity: universal laws governing the generalization-identification tradeoff
Intelligent systems must deploy internal representations that are simultaneously structured (to support broad generalization) and selective (to preserve input identity). We expose a fundamental limit on this tradeoff. For any model whose representational similarity between inputs decays with finite semantic resolution, we derive closed-form expressions that pin its probabilities of correct generalization and identification to a universal Pareto front independent of input-space geometry. Extending the analysis to noisy, heterogeneous spaces and to multiple simultaneous inputs predicts a sharp collapse of multi-input processing capacity and a non-monotonic optimum for the semantic resolution. A minimal ReLU network trained end-to-end reproduces these laws: during learning, a resolution boundary self-organizes, and empirical trajectories closely follow the theoretical curves for linearly decaying similarity. Finally, we demonstrate that the same limits persist in two markedly more complex settings: a convolutional neural network and state-of-the-art vision-language models. This confirms that finite-resolution similarity is a fundamental emergent informational constraint rather than a toy-model artifact. Together, these results provide an exact theory of the generalization-identification tradeoff and clarify how semantic resolution shapes the representational capacity of deep networks and brains alike.
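The setting can be illustrated with a small numerical toy. The sketch below is not the paper's derivation: it assumes a one-dimensional input space, a similarity kernel that decays linearly with distance up to a hypothetical resolution parameter eps, and a soft "same/different" readout. Sweeping eps then shows the qualitative tradeoff the abstract describes, with coarse resolution favoring generalization and fine resolution favoring identification. All function names and parameters (similarity, tradeoff, gen_radius) are illustrative choices, not quantities from the paper.

```python
# Toy Monte-Carlo illustration (assumed setup, not the paper's theory):
# a representational similarity that decays linearly with input distance
# up to a finite "semantic resolution" eps, and the resulting
# generalization-identification tradeoff.
import numpy as np

rng = np.random.default_rng(0)

def similarity(x, y, eps):
    """Linearly decaying similarity with finite resolution eps (assumed kernel)."""
    return np.clip(1.0 - np.abs(x - y) / eps, 0.0, 1.0)

def tradeoff(eps, n_pairs=100_000, gen_radius=0.05):
    """Estimate P(generalize) and P(identify) under a soft 'same/different' readout.

    Assumptions: inputs are uniform on [0, 1]; pairs within gen_radius should
    be treated as 'same' (generalization), while independent pairs should be
    told apart (identification); the readout answers 'same' with probability
    equal to the similarity.
    """
    x = rng.uniform(0.0, 1.0, n_pairs)
    # Pairs that should generalize: small semantic perturbations of x.
    x_near = x + rng.uniform(-gen_radius, gen_radius, n_pairs)
    p_gen = similarity(x, x_near, eps).mean()
    # Pairs that should be identified as distinct: independent draws.
    x_other = rng.uniform(0.0, 1.0, n_pairs)
    p_id = (1.0 - similarity(x, x_other, eps)).mean()
    return p_gen, p_id

# Sweeping the resolution traces a Pareto-style frontier: large eps (coarse
# resolution) boosts generalization at the cost of identification, and vice versa.
for eps in (0.01, 0.05, 0.1, 0.3, 1.0):
    p_gen, p_id = tradeoff(eps)
    print(f"eps={eps:4.2f}  P_gen={p_gen:.3f}  P_id={p_id:.3f}")
```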