arXiv:2310.15141 (latest version: v2)
SpecTr: Fast Speculative Decoding via Optimal Transport
Neural Information Processing Systems (NeurIPS), 2023
23 October 2023
Ziteng Sun
A. Suresh
Jae Hun Ro
Ahmad Beirami
Himanshu Jain
Felix X. Yu
Papers citing "SpecTr: Fast Speculative Decoding via Optimal Transport"
41 / 41 papers shown
Global Resolution: Optimal Multi-Draft Speculative Sampling via Convex Minimization
Rahul Thomas
Arka Pal
96
0
0
19 Nov 2025
SpecDiff-2: Scaling Diffusion Drafter Alignment For Faster Speculative Decoding
Jameson Sandler
Jacob K Christopher
Thomas Hartvigsen
Ferdinando Fioretto
148
1
0
01 Nov 2025
MC-SJD: Maximal Coupling Speculative Jacobi Decoding for Autoregressive Visual Generation Acceleration
Junhyuk So
Hyunho Kook
Chaeyeon Jang
Eunhyeok Park
112
0
0
28 Oct 2025
Not-a-Bandit: Provably No-Regret Drafter Selection in Speculative Decoding for LLMs
Hongyi Liu
Jiaji Huang
Zhen Jia
Youngsuk Park
Yu Wang
OffRL
95
1
0
22 Oct 2025
3-Model Speculative Decoding
Sanghyun Byun
Mohanad Odema
Jung Guack
Baisub Lee
Jacob Song
Woo Seong Chung
LRM
72
0
0
14 Oct 2025
SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs
Dachuan Shi
Abedelkadir Asi
Keying Li
Xiangchi Yuan
Leyan Pan
Wenke Lee
Wen Xiao
LRM
106
0
0
06 Oct 2025
SelfJudge: Faster Speculative Decoding via Self-Supervised Judge Verification
Kanghoon Yoon
Minsub Kim
Sungjae Lee
Joonhyung Lee
Sunghyeon Woo
Yeonjun In
S. Kwon
Chanyoung Park
Dongsoo Lee
116
0
0
26 Sep 2025
Bridging Draft Policy Misalignment: Group Tree Optimization for Speculative Decoding
Shijing Hu
Jingyang Li
Zhihui Lu
Pan Zhou
118
0
0
26 Sep 2025
SpecMER: Fast Protein Generation with K-mer Guided Speculative Decoding
Thomas Walton
Darin Tsui
Aryan Musharaf
Amirali Aghazadeh
72
0
0
25 Sep 2025
ViSpec: Accelerating Vision-Language Models with Vision-Aware Speculative Decoding
Jialiang Kang
Han Shu
Wenshuo Li
Yingjie Zhai
Xinghao Chen
MLLM
VLM
318
1
0
17 Sep 2025
Cost-Aware Contrastive Routing for LLMs
Reza Shirkavand
Shangqian Gao
Qi He
Heng-Chiao Huang
287
1
0
17 Aug 2025
READER: Retrieval-Assisted Drafter for Efficient LLM Inference
Maxim Divilkovskiy
Vitaly Malygin
Sergey Zlobin
Sultan Isali
Vasily Kalugin
Stanislav Ilyushin
Nuriza Aitassova
Yi Fei
Zeng Weidi
RALM
124
0
0
12 Aug 2025
XSpecMesh: Quality-Preserving Auto-Regressive Mesh Generation Acceleration via Multi-Head Speculative Decoding
Dian Chen
Yansong Qu
Xinyang Li
Ming Li
Shengchuan Zhang
197
2
0
31 Jul 2025
Proto-EVFL: Enhanced Vertical Federated Learning via Dual Prototype with Extremely Unaligned Data
Wei Guo
Yiyang Duan
Zhaojun Hu
Yiqi Tong
Fuzhen Zhuang
Qi. Wang
Jin Song Dong
R. Wu
Tengfei Liu
Yifan Sun
120
0
0
30 Jul 2025
SpecASR: Accelerating LLM-based Automatic Speech Recognition via Speculative Decoding
Design Automation Conference (DAC), 2025
Linye Wei
Shuzhang Zhong
Songqiang Xu
Runsheng Wang
Ru Huang
Meng Li
206
0
0
24 Jul 2025
SPECS: Faster Test-Time Scaling through Speculative Drafts
Mert Cemri
Nived Rajaraman
Rishabh Tiwari
Xiaoxuan Liu
Kurt Keutzer
Ion Stoica
Kannan Ramchandran
Ahmad Beirami
Ziteng Sun
LRM
187
2
0
15 Jun 2025
Gumbel-max List Sampling for Distribution Coupling with Multiple Samples
Joseph Rowan
Buu Phan
Ashish Khisti
273
0
0
05 Jun 2025
L-MTP: Leap Multi-Token Prediction Beyond Adjacent Context for Large Language Models
Xiaohao Liu
Xiaobo Xia
Weixiang Zhao
Manyi Zhang
Xianzhi Yu
Xiu Su
Shuo Yang
See-Kiong Ng
Tat-Seng Chua
KELM
LRM
360
3
0
23 May 2025
BanditSpec: Adaptive Speculative Decoding via Bandit Algorithms
Yunlong Hou
Fengzhuo Zhang
Cunxiao Du
Xuan Zhang
Jiachun Pan
Tianyu Pang
Chao Du
Vincent Y. F. Tan
Zhuoran Yang
OffRL
412
5
0
21 May 2025
SpecEdge: Scalable Edge-Assisted Serving Framework for Interactive LLMs
Jinwoo Park
Seunggeun Cho
Dongsu Han
253
2
0
16 May 2025
PARD: Accelerating LLM Inference with Low-Cost PARallel Draft Model Adaptation
Zihao An
Huajun Bai
Ziqiang Liu
Dong Li
E. Barsoum
425
1
0
23 Apr 2025
PCM: Picard Consistency Model for Fast Parallel Sampling of Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2025
Junhyuk So
Jiwoong Shin
Chaeyeon Jang
Eunhyeok Park
DiffM
286
0
0
25 Mar 2025
When Speculation Spills Secrets: Side Channels via Speculative Decoding In LLMs
Jiankun Wei
Abdulrahman Abdulrazzag
Tianchen Zhang
Adel Muursepp
Gururaj Saileshwar
397
4
0
01 Nov 2024
TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling
Jiahao Qiu
Yifu Lu
Yifan Zeng
Jiacheng Guo
Jiayi Geng
...
Ling Yang
Kaixuan Huang
Yue Wu
Mengdi Wang
418
49
0
18 Oct 2024
SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration
International Conference on Learning Representations (ICLR), 2024
Heming Xia
Yongqi Li
Jun Zhang
Cunxiao Du
Wenjie Li
LRM
321
36
0
09 Oct 2024
Efficient Inference for Large Language Model-based Generative Recommendation
International Conference on Learning Representations (ICLR), 2024
Xinyu Lin
Chaoqun Yang
Wenjie Wang
Yongqi Li
Cunxiao Du
Fuli Feng
See-Kiong Ng
Tat-Seng Chua
318
13
0
07 Oct 2024
Integrative Decoding: Improve Factuality via Implicit Self-consistency
Yi Cheng
Xiao Liang
Yeyun Gong
Wen Xiao
Song Wang
...
Wenjie Li
Jian Jiao
Qi Chen
Peng Cheng
Wayne Xiong
HILM
481
6
0
02 Oct 2024
Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding
International Conference on Learning Representations (ICLR), 2024
Yao Teng
Han Shi
Xian Liu
Xuefei Ning
Guohao Dai
Yu Wang
Zhenguo Li
Xihui Liu
341
41
0
02 Oct 2024
Coupling without Communication and Drafter-Invariant Speculative Decoding
International Symposium on Information Theory (ISIT), 2024
Majid Daliri
Christopher Musco
A. Suresh
359
2
0
15 Aug 2024
SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths
Kaixuan Huang
Xudong Guo
M. Y. Wang
477
40
0
30 May 2024
Beyond the Speculative Game: A Survey of Speculative Execution in Large Language Models
Chen Zhang
Zhuorui Liu
Dawei Song
LRM
224
8
0
23 Apr 2024
Language Model Cascades: Token-level uncertainty and beyond
Neha Gupta
Harikrishna Narasimhan
Wittawat Jitkrittum
A. S. Rawat
A. Menon
Sanjiv Kumar
UQLM
408
87
0
15 Apr 2024
LLM Inference Unveiled: Survey and Roofline Model Insights
Zhihang Yuan
Yuzhang Shang
Yang Zhou
Zhen Dong
Zhe Zhou
...
Yong Jae Lee
Yan Yan
Beidi Chen
Guangyu Sun
Kurt Keutzer
583
145
0
26 Feb 2024
Chimera: A Lossless Decoding Method for Accelerating Large Language Models Inference by Fusing all Tokens
Jiahong Yu
Qianshi Pang
Zihao Wang
Huiping Zhuang
Cen Chen
Xiaofeng Zou
238
5
0
24 Feb 2024
Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding
Zhuoming Chen
Avner May
Ruslan Svirschevski
Yuhsun Huang
Max Ryabinin
Zhihao Jia
Beidi Chen
327
67
0
19 Feb 2024
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Yichao Fu
Peter Bailis
Ion Stoica
Hao Zhang
333
235
0
03 Feb 2024
EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty
International Conference on Machine Learning (ICML), 2024
Yuhui Li
Fangyun Wei
Chao Zhang
Hongyang R. Zhang
534
302
0
26 Jan 2024
Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Heming Xia
Zhe Yang
Qingxiu Dong
Peiyi Wang
Chak Tou Leong
Tao Ge
Tianyu Liu
Wenjie Li
Zhifang Sui
LRM
425
201
0
15 Jan 2024
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems
Xupeng Miao
Zhihao Zhang
Xinhao Cheng
Hongyi Jin
Tianqi Chen
Zhihao Jia
360
118
0
23 Dec 2023
Controlled Decoding from Language Models
International Conference on Machine Learning (ICML), 2023
Sidharth Mudgal
Jong Lee
H. Ganapathy
Yaguang Li
Tao Wang
...
Michael Collins
Trevor Strohman
Jilin Chen
Alex Beutel
Ahmad Beirami
443
113
0
25 Oct 2023
DistillSpec: Improving Speculative Decoding via Knowledge Distillation
International Conference on Learning Representations (ICLR), 2023
Yongchao Zhou
Kaifeng Lyu
A. S. Rawat
A. Menon
Afshin Rostamizadeh
Sanjiv Kumar
Jean-François Kagy
Rishabh Agarwal
214
122
0
12 Oct 2023