The Role of n-gram Smoothing in the Age of Neural Networks

25 March 2024
Luca Malagutti
Andrius Buinovskij
Anej Svete
Clara Meister
Afra Amini
Ryan Cotterell
Abstract

For nearly three decades, language models derived from the n-gram assumption held the state of the art on the language modeling task. The key to their success lay in the application of various smoothing techniques that served to combat overfitting. However, when neural language models toppled n-gram models as the best performers, n-gram smoothing techniques became less relevant. Indeed, it would hardly be an understatement to suggest that the line of inquiry into n-gram smoothing techniques became dormant. This paper re-opens the role classical n-gram smoothing techniques may play in the age of neural language models. First, we draw a formal equivalence between label smoothing, a popular regularization technique for neural language models, and add-λ smoothing. Second, we derive a generalized framework for converting any n-gram smoothing technique into a regularizer compatible with neural language models. Our empirical results find that our novel regularizers are comparable to and, indeed, sometimes outperform label smoothing on language modeling and machine translation.
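To make the first result concrete, below is a minimal PyTorch-style sketch of label smoothing, written as a mixture of the one-hot target with the uniform distribution over the vocabulary. The function name, the epsilon default, and the implementation details are illustrative assumptions rather than the paper's own code, but the mixture mirrors how add-λ smoothing interpolates empirical n-gram counts with a uniform prior.

import torch
import torch.nn.functional as F

def label_smoothed_cross_entropy(logits, targets, epsilon=0.1):
    # Cross-entropy against the smoothed target distribution
    #   q = (1 - epsilon) * one_hot(target) + epsilon * uniform,
    # which decomposes into the standard NLL term plus a term that
    # penalizes divergence from the uniform distribution.
    log_probs = F.log_softmax(logits, dim=-1)                       # (batch, vocab)
    nll = -log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)  # one-hot part of q
    uniform_term = -log_probs.mean(dim=-1)                          # uniform part of q
    return ((1.0 - epsilon) * nll + epsilon * uniform_term).mean()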
