2505.10938
Accurate KV Cache Quantization with Outlier Tokens Tracing
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
16 May 2025
Yi Su
Yuechi Zhou
Quantong Qiu
Jilong Li
Qingrong Xia
Ping Li
Xinyu Duan
Zhefeng Wang
Min Zhang
MQ
Papers citing "Accurate KV Cache Quantization with Outlier Tokens Tracing"
4 / 4 papers shown
PatternKV: Flattening KV Representation Expands Quantization Headroom
Ji Zhang
Yiwei Li
Shaoxiong Feng
Peiwen Yuan
Xinglin Wang
...
Y. Zhang
Chuyi Tan
Boyuan Pan
Yao Hu
Kan Li
MQ
143
0
0
05 Oct 2025
Survey of Specialized Large Language Model
Chenghan Yang
Ruiyu Zhao
Yang Liu
Ling Jiang
LM&MA
109
1
0
27 Aug 2025
Taming the Titans: A Survey of Efficient LLM Inference Serving
Ranran Zhen
Junlin Li
Yixin Ji
Zhiyong Yang
Tong Liu
Qingrong Xia
Xinyu Duan
Zehao Wang
Baoxing Huai
Hao Fei
LLMAG
413
7
0
28 Apr 2025
Fast Transformer Decoding: One Write-Head is All You Need
Noam M. Shazeer
588
636
0
06 Nov 2019