N-Grammer: Augmenting Transformers with latent n-grams
arXiv:2207.06366 · 13 July 2022
Aurko Roy, Rohan Anil, Guangda Lai, Benjamin Lee, Jeffrey Zhao, Shuyuan Zhang, Shibo Wang, Ye Zhang, Shen Wu, Rigel Swavely, Tao Yu, Phuong Dao, Christopher Fifty, Z. Chen, Yonghui Wu
Papers citing "N-Grammer: Augmenting Transformers with latent n-grams" (7 / 7 papers shown)
N-Gram Induction Heads for In-Context RL: Improving Stability and Reducing Data Needs
Ilya Zisman, Alexander Nikulin, Andrei Polubarov, Nikita Lyubaykin, Vladislav Kurenkov, Igor Kiselev
OffRL · 44 · 1 · 0 · 04 Nov 2024
Transformer-VQ: Linear-Time Transformers via Vector Quantization
Albert Mohwald
26 · 15 · 0 · 28 Sep 2023
Primer: Searching for Efficient Transformers for Language Modeling
David R. So, Wojciech Mańke, Hanxiao Liu, Zihang Dai, Noam M. Shazeer, Quoc V. Le
VLM · 83 · 152 · 0 · 17 Sep 2021
Big Bird: Transformers for Longer Sequences
Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed
VLM · 257 · 2,013 · 0 · 28 Jul 2020
Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy, M. Saffar, Ashish Vaswani, David Grangier
MoE · 238 · 579 · 0 · 12 Mar 2020
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu, M. Schuster, Z. Chen, Quoc V. Le, Mohammad Norouzi, ..., Alex Rudnick, Oriol Vinyals, G. Corrado, Macduff Hughes, J. Dean
AIMat · 716 · 6,743 · 0 · 26 Sep 2016
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov, Kai Chen, G. Corrado, J. Dean
3DV · 230 · 31,253 · 0 · 16 Jan 2013