OPT-Tree: Speculative Decoding with Adaptive Draft Tree Structure

25 June 2024

Juntao Li

Qingrong Xia

Zhefeng Wang

Min Zhang

Papers citing "OPT-Tree: Speculative Decoding with Adaptive Draft Tree Structure"

Title
Taming the Titans: A Survey of Efficient LLM Inference Serving | Ranran Zhen, J. Li, Yixin Ji, Z. Yang, Tong Liu, Qingrong Xia, Xinyu Duan, Z. Wang, Baoxing Huai, M. Zhang | LLMAG | 77 0 0 | 28 Apr 2025
Speculative Decoding and Beyond: An In-Depth Survey of Techniques | Y. Hu, Zining Liu, Zhenyuan Dong, Tianfan Peng, Bradley McDanel, S. Zhang | | 82 0 0 | 27 Feb 2025
CORAL: Learning Consistent Representations across Multi-step Training with Lighter Speculative Drafter | Yepeng Weng, Dianwen Mei, Huishi Qiu, Xujie Chen, Li Liu, Jiang Tian, Zhongchao Shi | | 38 0 0 | 24 Feb 2025
TETRIS: Optimal Draft Token Selection for Batch Speculative Decoding | Zhaoxuan Wu, Zijian Zhou, Arun Verma, Alok Prakash, Daniela Rus, Bryan Kian Hsiang Low | | 55 0 0 | 24 Feb 2025
C2T: A Classifier-Based Tree Construction Method in Speculative Decoding | Feiye Huo, Jianchao Tan, K. Zhang, Xunliang Cai, Shengli Sun | | 33 0 0 | 20 Feb 2025
Closer Look at Efficient Inference Methods: A Survey of Speculative Decoding | Hyun Ryu, Eric Kim | | 64 3 0 | 20 Nov 2024
Faster Language Models with Better Multi-Token Prediction Using Tensor Decomposition | Artem Basharin, Andrei Chertkov, Ivan V. Oseledets | | 26 1 0 | 23 Oct 2024
QSpec: Speculative Decoding with Complementary Quantization Schemes | Juntao Zhao, Wenhao Lu, Sheng Wang, Lingpeng Kong, Chuan Wu | MQ | 42 5 0 | 15 Oct 2024
ParallelSpec: Parallel Drafter for Efficient Speculative Decoding | Zilin Xiao, Hongming Zhang, Tao Ge, Siru Ouyang, Vicente Ordonez, Dong Yu | | 28 5 0 | 08 Oct 2024
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding | Yichao Fu, Peter Bailis, Ion Stoica, Hao Zhang | | 118 134 0 | 03 Feb 2024