Testing Transformer Learnability on the Arithmetic Sequence of Rooted Trees

Main: 18 pages
8 figures
Bibliography: 3 pages
Abstract

We study whether a large language model can learn the deterministic sequence of trees generated by the iterated prime factorization of the natural numbers. Each integer is mapped to a rooted planar tree, and the resulting sequence $\mathbb{N}\mathcal{T}$ defines an arithmetic text with measurable statistical structure. A transformer network (the GPT-2 architecture) is trained from scratch on the first $10^{11}$ elements, and its predictive ability is then tested on next-word and masked-word prediction tasks. Our results show that the model partially learns the internal grammar of $\mathbb{N}\mathcal{T}$, capturing non-trivial regularities and correlations. This suggests that learnability may extend beyond empirical data to the very structure of arithmetic.
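To make the construction concrete, here is a minimal sketch of one standard way to map each integer to a rooted tree by iterated prime factorization (a Matula-Goebel-style correspondence). The abstract does not specify the paper's exact convention, so the recursion rule, the child ordering, and all function names below (`factorize`, `prime_index`, `tree`, `serialize`) are illustrative assumptions, not the authors' implementation.

```python
def factorize(n: int) -> list:
    """Prime factors of n with multiplicity, by trial division (ascending)."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def prime_index(p: int) -> int:
    """1-based index pi(p) of the prime p (2 -> 1, 3 -> 2, 5 -> 3, ...)."""
    return sum(1 for m in range(2, p + 1)
               if all(m % d for d in range(2, int(m ** 0.5) + 1)))

def tree(n: int) -> tuple:
    """Rooted tree of n under the assumed convention: 1 is a bare leaf;
    each prime factor p of n (with multiplicity) contributes the child
    subtree tree(pi(p)), recursing until everything bottoms out at 1."""
    if n == 1:
        return ()
    return tuple(tree(prime_index(p)) for p in factorize(n))

def serialize(t: tuple) -> str:
    """Nested-bracket string for a tree: one plausible way to turn the
    tree sequence into text a language model can be trained on."""
    return "(" + "".join(serialize(c) for c in t) + ")"

if __name__ == "__main__":
    # First few elements of the sequence under this convention,
    # e.g. 2 -> (()), 4 -> (()()), 12 -> (()()(())).
    for n in range(1, 13):
        print(n, serialize(tree(n)))
```

Concatenating such bracket strings over $n = 1, 2, 3, \ldots$ yields an "arithmetic text" of the kind described; the tokenization actually used for GPT-2 training is not stated in the abstract.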
