arXiv 2404.05843
Softmax Attention with Constant Cost per Token
8 April 2024
Franz A. Heinsen
Papers citing "Softmax Attention with Constant Cost per Token" (3 papers):

Repeat After Me: Transformers are Better than State Space Models at Copying
Samy Jelassi, David Brandfonbrener, Sham Kakade, Eran Malach
01 Feb 2024

The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, ..., Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, Connor Leahy
31 Dec 2020

Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy, M. Saffar, Ashish Vaswani, David Grangier
12 Mar 2020