QuantSpec: Self-Speculative Decoding with Hierarchical Quantized KV Cache

5 February 2025
Rishabh Tiwari, Haocheng Xi, Aditya Tomar, Coleman Hooper, Sehoon Kim, Maxwell Horton, Mahyar Najibi, Michael W. Mahoney, Kemal Kurniawan, Amir Gholami
Topics: MQ

Papers citing "QuantSpec: Self-Speculative Decoding with Hierarchical Quantized KV Cache"

3 of 53 citing papers shown (page 2 of 2).

Compressive Transformers for Long-Range Sequence Modelling
International Conference on Learning Representations (ICLR), 2019
Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Timothy Lillicrap
Topics: RALM, VLM, KELM
13 Nov 2019

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Journal of Machine Learning Research (JMLR), 2019
Colin Raffel, Noam M. Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu
Topics: AIMat
23 Oct 2019

Pointer Sentinel Mixture Models
Stephen Merity, Caiming Xiong, James Bradbury, R. Socher
Topics: RALM
26 Sep 2016