SlimInfer: Accelerating Long-Context LLM Inference via Dynamic Token Pruning

8 August 2025
Junyi Chen
Rubing Yang
Yushi Huang
Desheng Hui
Ao Zhou
Jianlei Yang
arXiv: 2508.06447 (abs) · PDF · HTML · GitHub (3★)

Papers citing "SlimInfer: Accelerating Long-Context LLM Inference via Dynamic Token Pruning"

3 papers
GRATING: Low-Latency and Memory-Efficient Semantic Selection on Device
Jiahao Zhou
Chengliang Lin
Dingji Li
Mingkai Dong
Haibo Chen
17 Oct 2025
Winning the Pruning Gamble: A Unified Approach to Joint Sample and Token Pruning for Efficient Supervised Fine-Tuning
Shaobo Wang
Jiaming Wang
Jiajun Zhang
C. Wang
Yue Min
...
Fei Huang
Huiqiang Jiang
Junyang Lin
Dayiheng Liu
Linfeng Zhang
28 Sep 2025
PDTrim: Targeted Pruning for Prefill-Decode Disaggregation in Inference
Hao Zhang
Mengsi Lyu
Zhuo Chen
Xingrun Xing
Yulong Ao
Yonghua Lin
29 Aug 2025