2505.10259
Cited By
SpecOffload: Unlocking Latent GPU Capacity for LLM Inference on Resource-Constrained Devices
15 May 2025
Xiangwen Zhuge
Xu Shen
Zeyu Wang
Fan Dang
Xuan Ding
Danyang Li
Yahui Han
Tianxiang Hao
Z. Yang
Papers citing
"SpecOffload: Unlocking Latent GPU Capacity for LLM Inference on Resource-Constrained Devices"
No papers