CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs
arXiv:2409.12490

19 September 2024
Junlin Lv, Yuan Feng, Xike Xie, Xin Jia, Qirong Peng, Guiming Xie
Papers citing "CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs"

3 papers:

1. Identify Critical KV Cache in LLM Inference from an Output Perturbation Perspective
   Yuan Feng, Junlin Lv, Y. Cao, Xike Xie, S. Kevin Zhou
   06 Feb 2025

2. KV Prediction for Improved Time to First Token
   Maxwell Horton, Qingqing Cao, Chenfan Sun, Yanzi Jin, Sachin Mehta, Mohammad Rastegari, Moin Nabi
   10 Oct 2024

3. Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads on Consumer-Grade Devices
   Yuxiang Huang, Binhang Yuan, Xu Han, Chaojun Xiao, Zhiyuan Liu
   02 Oct 2024