ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2402.15758
  4. Cited By
Chimera: A Lossless Decoding Method for Accelerating Large Language
  Models Inference by Fusing all Tokens

Chimera: A Lossless Decoding Method for Accelerating Large Language Models Inference by Fusing all Tokens

24 February 2024
Ziqian Zeng
Jiahong Yu
Qianshi Pang
Zihao W. Wang
Huiping Zhuang
Cen Chen
Xiaofeng Zou
ArXivPDFHTML

Papers citing "Chimera: A Lossless Decoding Method for Accelerating Large Language Models Inference by Fusing all Tokens"

7 / 7 papers shown
Title
GRIFFIN: Effective Token Alignment for Faster Speculative Decoding
GRIFFIN: Effective Token Alignment for Faster Speculative Decoding
Shijing Hu
Jingyang Li
Xingyu Xie
Zhihui Lu
Kim-Chuan Toh
Pan Zhou
38
0
0
16 Feb 2025
Closer Look at Efficient Inference Methods: A Survey of Speculative
  Decoding
Closer Look at Efficient Inference Methods: A Survey of Speculative Decoding
Hyun Ryu
Eric Kim
72
3
0
20 Nov 2024
Clover-2: Accurate Inference for Regressive Lightweight Speculative
  Decoding
Clover-2: Accurate Inference for Regressive Lightweight Speculative Decoding
Bin Xiao
Lujun Gui
Lei Su
Weipeng Chen
18
2
0
01 Aug 2024
Clover: Regressive Lightweight Speculative Decoding with Sequential
  Knowledge
Clover: Regressive Lightweight Speculative Decoding with Sequential Knowledge
Bin Xiao
Chunan Shi
Xiaonan Nie
Fan Yang
Xiangwei Deng
Lei Su
Weipeng Chen
Bin Cui
21
6
0
01 May 2024
Break the Sequential Dependency of LLM Inference Using Lookahead
  Decoding
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Yichao Fu
Peter Bailis
Ion Stoica
Hao Zhang
123
134
0
03 Feb 2024
The Falcon Series of Open Language Models
The Falcon Series of Open Language Models
Ebtesam Almazrouei
Hamza Alobeidli
Abdulaziz Alshamsi
Alessandro Cappelli
Ruxandra-Aimée Cojocaru
...
Quentin Malartic
Daniele Mazzotta
Badreddine Noune
B. Pannier
Guilherme Penedo
AI4TS
ALM
113
389
0
28 Nov 2023
On The Computational Complexity of Self-Attention
On The Computational Complexity of Self-Attention
Feyza Duman Keles
Pruthuvi Maheshakya Wijewardena
C. Hegde
63
107
0
11 Sep 2022
1