ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2403.00858
  4. Cited By
Direct Alignment of Draft Model for Speculative Decoding with
  Chat-Fine-Tuned LLMs

Direct Alignment of Draft Model for Speculative Decoding with Chat-Fine-Tuned LLMs

29 February 2024
Raghavv Goel
Mukul Gagrani
Wonseok Jeon
Junyoung Park
Mingu Lee
Christopher Lott
    ALM
ArXivPDFHTML

Papers citing "Direct Alignment of Draft Model for Speculative Decoding with Chat-Fine-Tuned LLMs"

4 / 4 papers shown
Title
KeyDiff: Key Similarity-Based KV Cache Eviction for Long-Context LLM Inference in Resource-Constrained Environments
KeyDiff: Key Similarity-Based KV Cache Eviction for Long-Context LLM Inference in Resource-Constrained Environments
Junyoung Park
Dalton Jones
Matt Morse
Raghavv Goel
Mingu Lee
Chris Lott
22
0
0
21 Apr 2025
Closer Look at Efficient Inference Methods: A Survey of Speculative
  Decoding
Closer Look at Efficient Inference Methods: A Survey of Speculative Decoding
Hyun Ryu
Eric Kim
72
3
0
20 Nov 2024
AdaEDL: Early Draft Stopping for Speculative Decoding of Large Language
  Models via an Entropy-based Lower Bound on Token Acceptance Probability
AdaEDL: Early Draft Stopping for Speculative Decoding of Large Language Models via an Entropy-based Lower Bound on Token Acceptance Probability
Sudhanshu Agrawal
Wonseok Jeon
Mingu Lee
16
0
0
24 Oct 2024
On Speculative Decoding for Multimodal Large Language Models
On Speculative Decoding for Multimodal Large Language Models
Mukul Gagrani
Raghavv Goel
Wonseok Jeon
Junyoung Park
Mingu Lee
Christopher Lott
LRM
19
6
0
13 Apr 2024
1