ResearchTrend.AI
Falcon: Faster and Parallel Inference of Large Language Models through Enhanced Semi-Autoregressive Drafting and Custom-Designed Decoding Tree

17 December 2024
Xiangxiang Gao
Weisheng Xie
Yiwei Xiang
Feng Ji

Papers citing "Falcon: Faster and Parallel Inference of Large Language Models through Enhanced Semi-Autoregressive Drafting and Custom-Designed Decoding Tree"

4 / 4 papers shown
RASD: Retrieval-Augmented Speculative Decoding
Guofeng Quan
Wenfeng Feng
Chuzhan Hao
Guochao Jiang
Yuewei Zhang
Hao Wang
RALM
68
1
0
05 Mar 2025
EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test
Yuhui Li
Fangyun Wei
Chao Zhang
Hongyang R. Zhang
106
3
0
03 Mar 2025
Speculative Decoding and Beyond: An In-Depth Survey of Techniques
Y. Hu
Zining Liu
Zhenyuan Dong
Tianfan Peng
Bradley McDanel
S. Zhang
85
0
0
27 Feb 2025
Faster Language Models with Better Multi-Token Prediction Using Tensor Decomposition
Artem Basharin
Andrei Chertkov
Ivan V. Oseledets
32
1
0
23 Oct 2024