Accurate KV Cache Quantization with Outlier Tokens Tracing

Annual Meeting of the Association for Computational Linguistics (ACL), 2025
16 May 2025
Yi Su
Yuechi Zhou
Quantong Qiu
Jilong Li
Qingrong Xia
Ping Li
Xinyu Duan
Zhefeng Wang
Min Zhang
    MQ

Papers citing "Accurate KV Cache Quantization with Outlier Tokens Tracing"

PatternKV: Flattening KV Representation Expands Quantization Headroom
Ji Zhang
Yiwei Li
Shaoxiong Feng
Peiwen Yuan
Xinglin Wang
...
Y. Zhang
Chuyi Tan
Boyuan Pan
Yao Hu
Kan Li
MQ
143
0
0
05 Oct 2025
Survey of Specialized Large Language Model
Chenghan Yang
Ruiyu Zhao
Yang Liu
Ling Jiang
LM&MA
109
1
0
27 Aug 2025
Taming the Titans: A Survey of Efficient LLM Inference Serving
Ranran Zhen
Junlin Li
Yixin Ji
Zhiyong Yang
Tong Liu
Qingrong Xia
Xinyu Duan
Zehao Wang
Baoxing Huai
Hao Fei
LLMAG
413
7
0
28 Apr 2025
Fast Transformer Decoding: One Write-Head is All You Need
Noam M. Shazeer
588
636
0
06 Nov 2019