N-Grammer: Augmenting Transformers with latent n-grams

13 July 2022
Aurko Roy
Rohan Anil
Guangda Lai
Benjamin Lee
Jeffrey Zhao
Shuyuan Zhang
Shibo Wang
Ye Zhang
Shen Wu
Rigel Swavely
Tao Yu
Phuong Dao
Christopher Fifty
Zhifeng Chen
Yonghui Wu
arXiv: 2207.06366 | GitHub (2861★)

Papers citing "N-Grammer: Augmenting Transformers with latent n-grams"

Scaling Embedding Layers in Language Models
Da Yu
Edith Cohen
Badih Ghazi
Yangsibo Huang
Pritish Kamath
Ravi Kumar
Daogao Liu
Chiyuan Zhang
03 Feb 2025
N-Gram Induction Heads for In-Context RL: Improving Stability and Reducing Data Needs
Ilya Zisman
Alexander Nikulin
Andrei Polubarov
Nikita Lyubaykin
Vladislav Kurenkov
Igor Kiselev
OffRL
04 Nov 2024
Revisiting N-Gram Models: Their Impact in Modern Neural Networks for Handwritten Text Recognition
Solène Tarride
Christopher Kermorvant
30 Apr 2024
Transformer-VQ: Linear-Time Transformers via Vector Quantization
International Conference on Learning Representations (ICLR), 2023
Lucas D. Lingle
28 Sep 2023
Cramming: Training a Language Model on a Single GPU in One Day
International Conference on Machine Learning (ICML), 2022
Jonas Geiping
Tom Goldstein
MoE
28 Dec 2022