  3. 2410.03001
Can Transformers Learn n-gram Language Models?

3 October 2024
Anej Svete
Nadav Borenstein
M. Zhou
Isabelle Augenstein
Ryan Cotterell
Abstract

Much theoretical work has described the ability of transformers to represent formal languages. However, linking theoretical results to empirical performance is not straightforward due to the complex interplay between the architecture, the learning algorithm, and training data. To test whether theoretical lower bounds imply learnability of formal languages, we turn to recent work relating transformers to n-gram language models (LMs). We study transformers' ability to learn random n-gram LMs of two kinds: ones with arbitrary next-symbol probabilities and ones where those are defined with shared parameters. We find that classic estimation techniques for n-gram LMs such as add-λ smoothing outperform transformers on the former, while transformers perform better on the latter, outperforming methods specifically designed to learn n-gram LMs.
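
The add-λ smoothing baseline mentioned in the abstract can be illustrated with a short sketch. The snippet below is a generic add-λ estimator for n-gram LMs, not the authors' code; the function name addlambda_ngram and the choices n=2 and λ=0.5 are illustrative assumptions.

```python
from collections import defaultdict, Counter

def addlambda_ngram(corpus, n=2, lam=0.5):
    """Estimate n-gram next-symbol probabilities with add-lambda smoothing.

    corpus: iterable of symbol sequences (lists of tokens).
    Returns prob(history, symbol) -> smoothed conditional probability.
    """
    vocab = {sym for seq in corpus for sym in seq}
    counts = defaultdict(Counter)  # (n-1)-symbol history -> Counter of next symbols
    for seq in corpus:
        padded = ["<bos>"] * (n - 1) + list(seq)
        for i in range(n - 1, len(padded)):
            history = tuple(padded[i - n + 1 : i])
            counts[history][padded[i]] += 1

    V = len(vocab)

    def prob(history, symbol):
        c = counts[tuple(history)]
        # add-lambda: (count + lambda) / (total + lambda * |vocab|)
        return (c[symbol] + lam) / (sum(c.values()) + lam * V)

    return prob

# Toy usage: estimate a smoothed bigram model from two short sequences.
p = addlambda_ngram([["a", "b", "a", "b"], ["a", "a", "b"]], n=2, lam=0.5)
print(p(["a"], "b"))  # (3 + 0.5) / (4 + 0.5 * 2) = 0.7
```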
