Task Specific Pruning with LLM-Sieve: How Many Parameters Does Your Task Really Need?

23 May 2025
Waleed Reda
Abhinav Jangda
Krishna Chintalapudi
Abstract

As Large Language Models (LLMs) are increasingly adopted for narrow tasks, such as medical question answering or sentiment analysis, and deployed in resource-constrained settings, a key question arises: how many parameters does a task actually need? In this work, we present LLM-Sieve, the first comprehensive framework for task-specific pruning of LLMs, which achieves 20-75% parameter reduction with only 1-5% accuracy degradation across diverse domains. Unlike prior methods that apply uniform pruning or rely on low-rank approximations of weight matrices or inputs in isolation, LLM-Sieve (i) learns task-aware joint projections to better approximate output behavior, and (ii) employs a Genetic Algorithm to discover differentiated pruning levels for each matrix. LLM-Sieve is fully compatible with LoRA fine-tuning and quantization, and uniquely demonstrates strong generalization across datasets within the same task domain. Together, these results establish a practical and robust mechanism for producing smaller, performant, task-specific models.
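The core idea of a task-aware projection can be illustrated with a small sketch: instead of low-rank-approximating a weight matrix `W` in isolation, pick the low-rank factorization that best preserves the matrix's *outputs* on task-specific activations `X`. The snippet below is a minimal illustration of that principle via a truncated SVD of the output `X @ W`, not the paper's actual algorithm; the function name and the choice of SVD are assumptions for illustration only.

```python
import numpy as np

def joint_projection_prune(W, X, rank):
    """Factor W (d_in x d_out) into two thin matrices A (d_in x rank)
    and B (rank x d_out) chosen to preserve the outputs Y = X @ W on
    task activations X, rather than to approximate W itself.

    Illustrative sketch only (truncated SVD of the task outputs);
    LLM-Sieve's learned joint projections may differ.
    """
    Y = X @ W                                 # task-specific outputs to preserve
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)
    Vr = Vt[:rank]                            # top-`rank` output directions
    # Project W's columns onto that subspace: W ≈ (W @ Vr.T) @ Vr
    A = W @ Vr.T                              # d_in x rank
    B = Vr                                    # rank x d_out
    return A, B

# Toy usage: X @ A @ B approximates X @ W with half the parameters.
rng = np.random.default_rng(0)
X = rng.standard_normal((256, 64))            # task activations
W = rng.standard_normal((64, 64))             # original weight matrix
A, B = joint_projection_prune(W, X, rank=16)
err = np.linalg.norm(X @ W - X @ A @ B) / np.linalg.norm(X @ W)
```

Because the factorization is fit to the task's activation distribution, directions of `W` that the task never exercises can be dropped aggressively, which is what allows per-matrix pruning levels (searched by a Genetic Algorithm in the paper) to vary so widely.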

@article{reda2025_2505.18350,
  title={Task Specific Pruning with LLM-Sieve: How Many Parameters Does Your Task Really Need?},
  author={Waleed Reda and Abhinav Jangda and Krishna Chintalapudi},
  journal={arXiv preprint arXiv:2505.18350},
  year={2025}
}