2505.23416
Cited By
KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction
29 May 2025
Jang-Hyun Kim
Jinuk Kim
S. Kwon
Jae W. Lee
Sangdoo Yun
Hyun Oh Song
MQ
VLM
ArXiv (abs)
PDF
HTML
HuggingFace (11 upvotes)
Github (111★)
Papers citing "KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction"
10 / 10 papers shown
ThinKV: Thought-Adaptive KV Cache Compression for Efficient Reasoning Models
Akshat Ramachandran
Marina Neseem
Charbel Sakr
Rangharajan Venkatesan
Brucek Khailany
Tushar Krishna
MQ
LRM
VLM
21
0
1
01 Oct 2025
EpiCache: Episodic KV Cache Management for Long Conversational Question Answering
Minsoo Kim
Arnav Kundu
Han-Byul Kim
Richa Dixit
Minsik Cho
32
0
0
22 Sep 2025
KVCompose: Efficient Structured KV Cache Compression with Composite Tokens
Dmitry Akulov
Mohamed Sana
A. De Domenico
Tareq Si Salem
Nicola Piovesan
Fadhel Ayed
MQ
48
0
0
05 Sep 2025
StreamMem: Query-Agnostic KV Cache Memory for Streaming Video Understanding
Yanlai Yang
Zhuokai Zhao
Satya Narayan Shukla
Aashu Singh
Shlok Kumar Mishra
Lizhu Zhang
Mengye Ren
VLM
38
1
0
21 Aug 2025
Gemma 3 Technical Report
Gemma Team
Aishwarya B Kamath
Johan Ferret
Shreya Pathak
Nino Vieillard
...
Harshal Tushar Lehri
Hussein Hazimeh
Ian Ballantyne
Idan Szpektor
Ivan Nardini
VLM
292
408
0
25 Mar 2025
Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference
Yuan Feng
Junlin Lv
Yukun Cao
Xike Xie
S. K. Zhou
VLM
276
65
0
28 Jan 2025
Qwen2.5-1M Technical Report
An Yang
Bowen Yu
Chong Li
Dayiheng Liu
Fei Huang
...
Xingzhang Ren
Xinlong Yang
You Li
Zhiying Xu
Zizhuo Zhang
200
63
0
28 Jan 2025
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
Di Liu
Meng Chen
Baotong Lu
Huiqiang Jiang
Zhenhua Han
...
Jianchao Tan
Chong Chen
Fan Yang
Yue Yang
Lili Qiu
275
63
0
03 Jan 2025
PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling
Zefan Cai
Yichi Zhang
Bofei Gao
Yuliang Liu
Yongqian Li
...
Wayne Xiong
Yue Dong
Baobao Chang
Junjie Hu
Wen Xiao
393
144
0
04 Jun 2024
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
Chengyue Wu
Haotian Tang
Shang Yang
Zhekai Zhang
Guangxuan Xiao
Chuang Gan
Song Han
259
130
0
07 May 2024