ResearchTrend.AI
OPT-Tree: Speculative Decoding with Adaptive Draft Tree Structure


25 June 2024
Jikai Wang
Yi Su
Juntao Li
Qingrong Xia
Zi Ye
Xinyu Duan
Zhefeng Wang
Min Zhang

Papers citing "OPT-Tree: Speculative Decoding with Adaptive Draft Tree Structure"

10 / 10 papers shown
Taming the Titans: A Survey of Efficient LLM Inference Serving
Ranran Zhen
J. Li
Yixin Ji
Z. Yang
Tong Liu
Qingrong Xia
Xinyu Duan
Z. Wang
Baoxing Huai
M. Zhang
LLMAG
77
0
0
28 Apr 2025
Speculative Decoding and Beyond: An In-Depth Survey of Techniques
Y. Hu
Zining Liu
Zhenyuan Dong
Tianfan Peng
Bradley McDanel
S. Zhang
82
0
0
27 Feb 2025
CORAL: Learning Consistent Representations across Multi-step Training with Lighter Speculative Drafter
Yepeng Weng
Dianwen Mei
Huishi Qiu
Xujie Chen
Li Liu
Jiang Tian
Zhongchao Shi
38
0
0
24 Feb 2025
TETRIS: Optimal Draft Token Selection for Batch Speculative Decoding
Zhaoxuan Wu
Zijian Zhou
Arun Verma
Alok Prakash
Daniela Rus
Bryan Kian Hsiang Low
55
0
0
24 Feb 2025
C2T: A Classifier-Based Tree Construction Method in Speculative Decoding
Feiye Huo
Jianchao Tan
K. Zhang
Xunliang Cai
Shengli Sun
33
0
0
20 Feb 2025
Closer Look at Efficient Inference Methods: A Survey of Speculative Decoding
Hyun Ryu
Eric Kim
64
3
0
20 Nov 2024
Faster Language Models with Better Multi-Token Prediction Using Tensor Decomposition
Artem Basharin
Andrei Chertkov
Ivan V. Oseledets
26
1
0
23 Oct 2024
QSpec: Speculative Decoding with Complementary Quantization Schemes
Juntao Zhao
Wenhao Lu
Sheng Wang
Lingpeng Kong
Chuan Wu
MQ
42
5
0
15 Oct 2024
ParallelSpec: Parallel Drafter for Efficient Speculative Decoding
Zilin Xiao
Hongming Zhang
Tao Ge
Siru Ouyang
Vicente Ordonez
Dong Yu
28
5
0
08 Oct 2024
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Yichao Fu
Peter Bailis
Ion Stoica
Hao Zhang
118
134
0
03 Feb 2024