SlimInfer: Accelerating Long-Context LLM Inference via Dynamic Token Pruning
arXiv: 2508.06447 (v2, latest)
8 August 2025
Junyi Chen, Rubing Yang, Yushi Huang, Desheng Hui, Ao Zhou, Jianlei Yang

Papers citing "SlimInfer: Accelerating Long-Context LLM Inference via Dynamic Token Pruning" (3 papers)

GRATING: Low-Latency and Memory-Efficient Semantic Selection on Device
Jiahao Zhou, Chengliang Lin, Dingji Li, Mingkai Dong, Haibo Chen
17 Oct 2025

Winning the Pruning Gamble: A Unified Approach to Joint Sample and Token Pruning for Efficient Supervised Fine-Tuning
Shaobo Wang, Jiaming Wang, Jiajun Zhang, C. Wang, Yue Min, ..., Fei Huang, Huiqiang Jiang, Junyang Lin, Dayiheng Liu, Linfeng Zhang
28 Sep 2025

PDTrim: Targeted Pruning for Prefill-Decode Disaggregation in Inference
Hao Zhang, Mengsi Lyu, Zhuo Chen, Xingrun Xing, Yulong Ao, Yonghua Lin
29 Aug 2025