When Compression Meets Model Compression: Memory-Efficient Double Compression for Large Language Models
24 February 2025
Weilan Wang, Yu Mao, Dongdong Tang, Hongchao Du, Nan Guan, Chun Jason Xue

Papers citing "When Compression Meets Model Compression: Memory-Efficient Double Compression for Large Language Models"

FlexInfer: Breaking Memory Constraint via Flexible and Efficient Offloading for On-Device LLM Inference
Hongchao Du, Shangyu Wu, Arina Kharlamova, Nan Guan, Chun Jason Xue
04 Mar 2025