
QJL: 1-Bit Quantized JL Transform for KV Cache Quantization with Zero Overhead

5 June 2024
A. Zandieh, Majid Daliri, Insu Han
MQ
ArXiv (abs) · PDF · HTML · GitHub (25★)

Papers citing "QJL: 1-Bit Quantized JL Transform for KV Cache Quantization with Zero Overhead"

10 / 10 papers shown
Mitigating Diffusion Model Hallucinations with Dynamic Guidance
Kostas Triaridis, Alexandros Graikos, Aggelina Chatziagapi, Grigorios G. Chrysos, Dimitris Samaras
DiffM
158 · 0 · 0
06 Oct 2025
KVmix: Gradient-Based Layer Importance-Aware Mixed-Precision Quantization for KV Cache
Fei Li, Song Liu, Weiguo Wu, Shiqiang Nie, Jinyu Wang
MQ
177 · 1 · 0
18 May 2025
TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate
A. Zandieh, Majid Daliri, Majid Hadian, Vahab Mirrokni
MQ
769 · 16 · 0
28 Apr 2025
SQuat: Subspace-orthogonal KV Cache Quantization
Hao Wang, Ligong Han, Kai Xu, Akash Srivastava
MQ
439 · 3 · 0
31 Mar 2025
Compression Barriers for Autoregressive Transformers
Themistoklis Haris, Krzysztof Onak
200 · 2 · 0
21 Feb 2025
Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads on Consumer-Grade Devices
Yuxiang Huang, Binhang Yuan, Xu Han, Chaojun Xiao, Zhiyuan Liu
RALM
653 · 12 · 0
02 Oct 2024
A Tighter Complexity Analysis of SparseGPT
Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song
405 · 23 · 0
22 Aug 2024
KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches
Jiayi Yuan, Hongyi Liu, Shaochen Zhong, Yu-Neng Chuang, ..., Hongye Jin, Vipin Chaudhary, Zhaozhuo Xu, Zirui Liu, Xia Hu
360 · 43 · 0
01 Jul 2024
Streaming Kernel PCA Algorithm With Small Space
Yichuan Deng, Zhao Song, Zifan Wang, Hangke Zhang
381 · 4 · 0
08 Mar 2023
Fast Transformer Decoding: One Write-Head is All You Need
Noam M. Shazeer
834 · 731 · 0
06 Nov 2019