2502.20330
Cited By
Long-Context Inference with Retrieval-Augmented Speculative Decoding
27 February 2025
Guanzheng Chen, Qilong Feng, Jinjie Ni, Xin Li, Michael Shieh
RALM
Papers citing
"Long-Context Inference with Retrieval-Augmented Speculative Decoding"
3 / 3 papers shown
Scaling Test-Time Inference with Policy-Optimized, Dynamic Retrieval-Augmented Generation via KV Caching and Decoding
Sakhinana Sagar Srinivas, Venkataramana Runkana
OffRL
43
1
0
02 Apr 2025
PromptDistill: Query-based Selective Token Retention in Intermediate Layers for Efficient Large Language Model Inference
Weisheng Jin, Maojia Song, Tej Deep Pala, Yew Ken Chia, Amir Zadeh, Chuan Li, Soujanya Poria
VLM
47
0
0
30 Mar 2025
GPU-Accelerated Motion Planning of an Underactuated Forestry Crane in Cluttered Environments
M. Vu, Gerald Ebmer, Alexander Watcher, Marc-Philip Ecker, Giang Nguyen, Tobias Glueck
57
0
0
18 Mar 2025