2406.16758
Cited By
Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters
24 June 2024
Euiin Yi
Taehyeon Kim
Hongseok Jeung
Du-Seong Chang
Se-Young Yun
Papers citing "Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters"
3 / 3 papers shown
Title
Mixture of Attentions For Speculative Decoding
Matthieu Zimmer
Milan Gritta
Gerasimos Lampouras
Haitham Bou Ammar
Jun Wang
63
4
0
04 Oct 2024
Speculative Streaming: Fast LLM Inference without Auxiliary Models
Nikhil Bhendawade
Irina Belousova
Qichen Fu
Henry Mason
Mohammad Rastegari
Mahyar Najibi
LRM
24
27
0
16 Feb 2024
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Yichao Fu
Peter Bailis
Ion Stoica
Hao Zhang
120
134
0
03 Feb 2024