Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory
arXiv: 2405.08707 · 14 May 2024
Xueyan Niu, Bo Bai, Lei Deng, Wei Han
Papers citing "Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory" (8 of 8 shown)
Efficient Prompt Compression with Evaluator Heads for Long-Context Transformer Inference
WeiZhi Fei, Xueyan Niu, Guoqing Xie, Yingqing Liu, Bo Bai, Wei Han
28 · 1 · 0 — 22 Jan 2025
A Theoretical Survey on Foundation Models
Shi Fu, Yuzhu Chen, Yingjie Wang, Dacheng Tao
18 · 0 · 0 — 15 Oct 2024
On the Limitations of Compute Thresholds as a Governance Strategy
Sara Hooker
39 · 14 · 0 — 08 Jul 2024
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Tsendsuren Munkhdalai, Manaal Faruqui, Siddharth Gopal
Tags: LRM, LLMAG, CLL
79 · 101 · 0 — 10 Apr 2024
Language models scale reliably with over-training and on downstream tasks
S. Gadre, Georgios Smyrnis, Vaishaal Shankar, Suchin Gururangan, Mitchell Wortsman, ..., Y. Carmon, Achal Dave, Reinhard Heckel, Niklas Muennighoff, Ludwig Schmidt
Tags: ALM, ELM, LRM
91 · 40 · 0 — 13 Mar 2024
Word Acquisition in Neural Language Models
Tyler A. Chang, Benjamin Bergen
27 · 29 · 0 — 05 Oct 2021
Hierarchical Associative Memory
Dmitry Krotov
Tags: BDL
89 · 22 · 0 — 14 Jul 2021
Scaling Laws for Neural Language Models
Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei
220 · 3,054 · 0 — 23 Jan 2020