2407.01955
Cited By
S2D: Sorted Speculative Decoding For More Efficient Deployment of Nested Large Language Models
2 July 2024
Parsa Kavehzadeh
Mohammadreza Pourreza
Mojtaba Valipour
Tianshu Zhu
Haoli Bai
Ali Ghodsi
Boxing Chen
Mehdi Rezagholizadeh
Papers citing
"S2D: Sorted Speculative Decoding For More Efficient Deployment of Nested Large Language Models"
2 / 2 papers shown
Title
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Yichao Fu
Peter Bailis
Ion Stoica
Hao Zhang
123
134
0
03 Feb 2024
Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference
Parsa Kavehzadeh
Mojtaba Valipour
Marzieh S. Tahaei
Ali Ghodsi
Boxing Chen
Mehdi Rezagholizadeh
22
6
0
16 Sep 2023