Unlocking Data-free Low-bit Quantization with Matrix Decomposition for KV Cache Compression

21 May 2024
Peiyu Liu, Zeming Gao, Wayne Xin Zhao, Yipeng Ma, Tao Wang, Ji-Rong Wen
MQ

Papers citing "Unlocking Data-free Low-bit Quantization with Matrix Decomposition for KV Cache Compression"

3 / 3 papers shown
SVDq: 1.25-bit and 410x Key Cache Compression for LLM Attention
24 Feb 2025
Hong Yankun, Li Xing, Zhen Hui-Ling, Yu Xianzhi, Liu Wulong, Yuan Mingxuan
MQ

Over-parameterized Student Model via Tensor Decomposition Boosted Knowledge Distillation
10 Nov 2024
Yu-Liang Zhan, Zhong-Yi Lu, Hao-Lun Sun, Ze-Feng Gao

Lossless KV Cache Compression to 2%
20 Oct 2024
Zhen Yang, Jizong Han, Kan Wu, Ruobing Xie, An Wang, X. Sun, Zhanhui Kang
VLM, MQ