Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2406.17276
Cited By
OPT-Tree: Speculative Decoding with Adaptive Draft Tree Structure
25 June 2024
Jikai Wang
Yi Su
Juntao Li
Qingrong Xia
Zi Ye
Xinyu Duan
Zhefeng Wang
Min Zhang
Re-assign community
ArXiv
PDF
HTML
Papers citing
"OPT-Tree: Speculative Decoding with Adaptive Draft Tree Structure"
10 / 10 papers shown
Title
Taming the Titans: A Survey of Efficient LLM Inference Serving
Ranran Zhen
J. Li
Yixin Ji
Z. Yang
Tong Liu
Qingrong Xia
Xinyu Duan
Z. Wang
Baoxing Huai
M. Zhang
LLMAG
77
0
0
28 Apr 2025
Speculative Decoding and Beyond: An In-Depth Survey of Techniques
Y. Hu
Zining Liu
Zhenyuan Dong
Tianfan Peng
Bradley McDanel
S. Zhang
82
0
0
27 Feb 2025
CORAL: Learning Consistent Representations across Multi-step Training with Lighter Speculative Drafter
Yepeng Weng
Dianwen Mei
Huishi Qiu
Xujie Chen
Li Liu
Jiang Tian
Zhongchao Shi
38
0
0
24 Feb 2025
TETRIS: Optimal Draft Token Selection for Batch Speculative Decoding
Zhaoxuan Wu
Zijian Zhou
Arun Verma
Alok Prakash
Daniela Rus
Bryan Kian Hsiang Low
53
0
0
24 Feb 2025
C2T: A Classifier-Based Tree Construction Method in Speculative Decoding
Feiye Huo
Jianchao Tan
K. Zhang
Xunliang Cai
Shengli Sun
31
0
0
20 Feb 2025
Closer Look at Efficient Inference Methods: A Survey of Speculative Decoding
Hyun Ryu
Eric Kim
64
3
0
20 Nov 2024
Faster Language Models with Better Multi-Token Prediction Using Tensor Decomposition
Artem Basharin
Andrei Chertkov
Ivan V. Oseledets
26
1
0
23 Oct 2024
QSpec: Speculative Decoding with Complementary Quantization Schemes
Juntao Zhao
Wenhao Lu
Sheng Wang
Lingpeng Kong
Chuan Wu
MQ
42
5
0
15 Oct 2024
ParallelSpec: Parallel Drafter for Efficient Speculative Decoding
Zilin Xiao
Hongming Zhang
Tao Ge
Siru Ouyang
Vicente Ordonez
Dong Yu
28
5
0
08 Oct 2024
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Yichao Fu
Peter Bailis
Ion Stoica
Hao Zhang
115
134
0
03 Feb 2024
1