2410.06916
Cited By
SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration
9 October 2024
Heming Xia
Yongqi Li
Jun Zhang
Cunxiao Du
Wenjie Li
LRM
Papers citing
"SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration"
3 / 3 papers shown
Title
PARD: Accelerating LLM Inference with Low-Cost PARallel Draft Model Adaptation
Zihao An
Huajun Bai
Z. Liu
Dong Li
E. Barsoum
48
0
0
23 Apr 2025
DEL: Context-Aware Dynamic Exit Layer for Efficient Self-Speculative Decoding
Hossein Entezari Zarch
Lei Gao
Chaoyi Jiang
Murali Annavaram
LRM
21
0
0
08 Apr 2025
Speculative Decoding and Beyond: An In-Depth Survey of Techniques
Y. Hu
Zining Liu
Zhenyuan Dong
Tianfan Peng
Bradley McDanel
S. Zhang
82
0
0
27 Feb 2025