ResearchTrend.AI
SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration

9 October 2024
Heming Xia
Yongqi Li
Jun Zhang
Cunxiao Du
Wenjie Li
    LRM

Papers citing "SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration"

3 / 3 papers shown
PARD: Accelerating LLM Inference with Low-Cost PARallel Draft Model Adaptation
Zihao An
Huajun Bai
Z. Liu
Dong Li
E. Barsoum
23 Apr 2025
DEL: Context-Aware Dynamic Exit Layer for Efficient Self-Speculative Decoding
Hossein Entezari Zarch
Lei Gao
Chaoyi Jiang
Murali Annavaram
LRM
08 Apr 2025
Speculative Decoding and Beyond: An In-Depth Survey of Techniques
Y. Hu
Zining Liu
Zhenyuan Dong
Tianfan Peng
Bradley McDanel
S. Zhang
27 Feb 2025