2105.03994
Dispatcher: A Message-Passing Approach To Language Modelling
9 May 2021
A. Cetoli
Papers citing "Dispatcher: A Message-Passing Approach To Language Modelling"
3 / 3 papers shown
Title
Big Bird: Transformers for Longer Sequences
Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed
VLM
262
2,013
0
28 Jul 2020
Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy, M. Saffar, Ashish Vaswani, David Grangier
MoE
240
579
0
12 Mar 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
MoE
245
1,817
0
17 Sep 2019