ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2410.01805
  4. Cited By
Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads on Consumer-Grade Devices

Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads on Consumer-Grade Devices

2 October 2024
Yuxiang Huang
Binhang Yuan
Xu Han
Chaojun Xiao
Zhiyuan Liu
    RALM
ArXivPDFHTML

Papers citing "Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads on Consumer-Grade Devices"

3 / 3 papers shown
Title
APB: Accelerating Distributed Long-Context Inference by Passing Compressed Context Blocks across GPUs
APB: Accelerating Distributed Long-Context Inference by Passing Compressed Context Blocks across GPUs
Yuxiang Huang
Mingye Li
Xu Han
Chaojun Xiao
Weilin Zhao
Sun Ao
Hao Zhou
Jie Zhou
Zhiyuan Liu
Maosong Sun
37
0
0
17 Feb 2025
Exploring the Benefit of Activation Sparsity in Pre-training
Exploring the Benefit of Activation Sparsity in Pre-training
Zhengyan Zhang
Chaojun Xiao
Qiujieli Qin
Yankai Lin
Zhiyuan Zeng
Xu Han
Zhiyuan Liu
Ruobing Xie
Maosong Sun
Jie Zhou
MoE
58
3
0
04 Oct 2024
Inference-Friendly Models With MixAttention
Inference-Friendly Models With MixAttention
Shashank Rajput
Ying Sheng
Sean Owen
Vitaliy Chiley
71
1
0
23 Sep 2024
1