arXiv:2409.15012
Inference-Friendly Models With MixAttention
23 September 2024
Shashank Rajput, Ying Sheng, Sean Owen, Vitaliy Chiley
Papers citing "Inference-Friendly Models With MixAttention"
1 / 1 papers shown
Title
Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads on Consumer-Grade Devices
Yuxiang Huang, Binhang Yuan, Xu Han, Chaojun Xiao, Zhiyuan Liu
RALM
59
1
0
02 Oct 2024