arXiv:2503.00491
Tutorial Proposal: Speculative Decoding for Efficient LLM Inference
1 March 2025
Heming Xia
Cunxiao Du
Yongqian Li
Qian Liu
Wenjie Li
Papers citing
"Tutorial Proposal: Speculative Decoding for Efficient LLM Inference"
3 / 3 papers shown
Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting
Zilong Wang
Zifeng Wang
Long Le
Huaixiu Steven Zheng
Swaroop Mishra
...
Anush Mattapalli
Ankur Taly
Jingbo Shang
Chen-Yu Lee
Tomas Pfister
RALM
11 Jul 2024
EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty
International Conference on Machine Learning (ICML), 2024
Yuhui Li
Fangyun Wei
Chao Zhang
Hongyang R. Zhang
26 Jan 2024
Fast Transformer Decoding: One Write-Head is All You Need
Noam M. Shazeer
06 Nov 2019