2408.05676
Cited By
Efficiency Unleashed: Inference Acceleration for LLM-based Recommender Systems with Speculative Decoding
11 August 2024
Yunjia Xi, Hangyu Wang, Bo Chen, Jianghao Lin, Menghui Zhu, W. Liu, Ruiming Tang, Zhewei Wei, W. Zhang, Yong Yu
OffRL
Papers citing "Efficiency Unleashed: Inference Acceleration for LLM-based Recommender Systems with Speculative Decoding"
2 / 2 papers shown
Title
Speculative Decoding and Beyond: An In-Depth Survey of Techniques
Y. Hu, Zining Liu, Zhenyuan Dong, Tianfan Peng, Bradley McDanel, S. Zhang
55
0
0
27 Feb 2025
Efficient Inference for Large Language Model-based Generative Recommendation
Xinyu Lin, Chaoqun Yang, Wenjie Wang, Yongqi Li, Cunxiao Du, Fuli Feng, See-Kiong Ng, Tat-Seng Chua
34
1
0
07 Oct 2024