SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths

SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths

30 May 2024

Kaixuan Huang

Mengdi Wang

Papers citing "SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths"

6 / 6 papers shown

Title
Mixture of Attentions For Speculative Decoding Matthieu Zimmer Milan Gritta Gerasimos Lampouras Haitham Bou Ammar Jun Wang 63 4 0 04 Oct 2024
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models Jinliang Lu Ziliang Pang Min Xiao Yaochen Zhu Rui Xia Jiajun Zhang MoMe 19 17 0 08 Jul 2024
SnapKV: LLM Knows What You are Looking for Before Generation Yuhong Li Yingbing Huang Bowen Yang Bharat Venkitesh Acyr F. Locatelli Hanchen Ye Tianle Cai Patrick Lewis Deming Chen VLM 73 148 0 22 Apr 2024
Speculative Streaming: Fast LLM Inference without Auxiliary Models Nikhil Bhendawade Irina Belousova Qichen Fu Henry Mason Mohammad Rastegari Mahyar Najibi LRM 24 27 0 16 Feb 2024
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding Yichao Fu Peter Bailis Ion Stoica Hao Zhang 120 134 0 03 Feb 2024
Diffusion-LM Improves Controllable Text Generation Xiang Lisa Li John Thickstun Ishaan Gulrajani Percy Liang Tatsunori B. Hashimoto AI4CE 163 768 0 27 May 2022