2407.12021
Adaptive Draft-Verification for Efficient Large Language Model Decoding
27 June 2024
Xukun Liu
Bowen Lei
Ruqi Zhang
Dongkuan Xu
Papers citing "Adaptive Draft-Verification for Efficient Large Language Model Decoding"
5 / 5 papers shown
Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
Yeonhong Park
Jake Hyun
SangLyul Cho
Bonggeun Sim
Jae W. Lee
MQ
25
16
0
16 Feb 2024
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Yichao Fu
Peter Bailis
Ion Stoica
Hao Zhang
120
134
0
03 Feb 2024
LLM-Powered Hierarchical Language Agent for Real-time Human-AI Coordination
Jijia Liu
Chao Yu
Jiaxuan Gao
Yuqing Xie
Qingmin Liao
Yi Wu
Yu Wang
LLMAG
LM&Ro
79
34
0
23 Dec 2023
A Study of Generative Large Language Model for Medical Research and Healthcare
C.A.I. Peng
Xi Yang
Aokun Chen
Kaleb E. Smith
Nima M. Pournejatian
...
W. Hogan
E. Shenkman
Yi Guo
Jiang Bian
Yonghui Wu
LM&MA
ELM
AI4MH
134
121
0
22 May 2023
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
Ying Sheng
Lianmin Zheng
Binhang Yuan
Zhuohan Li
Max Ryabinin
...
Joseph E. Gonzalez
Percy Liang
Christopher Ré
Ion Stoica
Ce Zhang
135
208
0
13 Mar 2023