PEARL: Parallel Speculative Decoding with Adaptive Draft Length

13 August 2024

Papers citing "PEARL: Parallel Speculative Decoding with Adaptive Draft Length"

Title
Token-Driven GammaTune: Adaptive Calibration for Enhanced Speculative Decoding | Aayush Gautam, Susav Shrestha, Narasimha Annapareddy | 36 0 0 | 28 Mar 2025
Exploiting Edited Large Language Models as General Scientific Optimizers | Qitan Lv, T. Liu, H. Wang | 33 0 0 | 08 Mar 2025
DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting | Kai Lv, Honglin Guo, Qipeng Guo, Xipeng Qiu | 34 0 0 | 02 Mar 2025
Fuzzy Speculative Decoding for a Tunable Accuracy-Runtime Tradeoff | Maximilian Holsman, Yukun Huang, Bhuwan Dhingra | 26 0 0 | 28 Feb 2025
Speculative Decoding and Beyond: An In-Depth Survey of Techniques | Y. Hu, Zining Liu, Zhenyuan Dong, Tianfan Peng, Bradley McDanel, S. Zhang | 82 0 0 | 27 Feb 2025
TETRIS: Optimal Draft Token Selection for Batch Speculative Decoding | Zhaoxuan Wu, Zijian Zhou, Arun Verma, Alok Prakash, Daniela Rus, Bryan Kian Hsiang Low | 55 0 0 | 24 Feb 2025
ParallelSpec: Parallel Drafter for Efficient Speculative Decoding | Zilin Xiao, Hongming Zhang, Tao Ge, Siru Ouyang, Vicente Ordonez, Dong Yu | 28 5 0 | 08 Oct 2024
Dynamic Depth Decoding: Faster Speculative Decoding for LLMs | Oscar Brown, Zhengjie Wang, Andrea Do, Nikhil Mathew, Cheng Yu | 16 3 0 | 30 Aug 2024