2407.15516
Cited By
Attention Is All You Need But You Don't Need All Of It For Inference of Large Language Models
22 July 2024
Georgy Tyukin
G. Dovonon
Jean Kaddour
Pasquale Minervini
LRM
Papers citing "Attention Is All You Need But You Don't Need All Of It For Inference of Large Language Models"
3 / 3 papers shown
Title
Layer-Condensed KV Cache for Efficient Inference of Large Language Models
Haoyi Wu
Kewei Tu
MQ
35
17
0
17 May 2024
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
DeepSeek-AI
Xiao Bi
Deli Chen
Guanting Chen
...
Yao Zhao
Shangyan Zhou
Shunfeng Zhou
Qihao Zhu
Yuheng Zou
LRM
ALM
133
298
0
05 Jan 2024
Carbon Emissions and Large Neural Network Training
David A. Patterson
Joseph E. Gonzalez
Quoc V. Le
Chen Liang
Lluís-Miquel Munguía
D. Rothchild
David R. So
Maud Texier
J. Dean
AI4CE
233
626
0
21 Apr 2021