SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot [VLM]
Elias Frantar, Dan Alistarh
International Conference on Machine Learning (ICML), 2023
2 January 2023
Links: arXiv (abs) · PDF · HTML · Hugging Face (3 upvotes) · GitHub (799★)
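The title's claim is that very large GPT-family models can be pruned to high sparsity in one shot, i.e. without retraining; SparseGPT achieves this with an approximate second-order (Hessian-based) layer-wise weight reconstruction. For contrast, the simplest one-shot baseline, plain magnitude pruning, can be sketched in a few lines. The NumPy function below is an illustrative baseline only, not the paper's method:

```python
import numpy as np

def one_shot_prune(W, sparsity=0.5):
    """Zero out the smallest-magnitude entries of W so that `sparsity`
    fraction of the weights become zero (plain magnitude pruning;
    SparseGPT instead reconstructs remaining weights layer-wise)."""
    k = int(W.size * sparsity)
    if k == 0:
        return W.copy()
    # k-th smallest absolute value is the pruning threshold
    thresh = np.partition(np.abs(W).ravel(), k - 1)[k - 1]
    mask = np.abs(W) > thresh  # keep only weights strictly above it
    return W * mask

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))
Wp = one_shot_prune(W, 0.5)
print(float((Wp == 0).mean()))  # fraction of zeroed weights
```

SparseGPT's contribution is precisely that at the scale of OPT-175B or BLOOM, this kind of naive one-shot magnitude pruning destroys accuracy, while its Hessian-guided update of the surviving weights does not.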

Papers citing "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot"

50 of 664 citing papers shown:
PATCH: Learnable Tile-level Hybrid Sparsity for LLMs
  Younes Hourri, Mohammad Mozaffari, M. Dehnavi (24 Dec 2025)

Think Before You Prune: Self-Reflective Structured Pruning for Reasoning Language Models [ReLM, LRM, VLM]
  Ziyan Wang, Enmao Diao, Qi Le, Pu Wang, G. Wang, Minwoo Lee, Shu-ping Yeh, Li Yang (01 Dec 2025)

Mosaic Pruning: A Hierarchical Framework for Generalizable Pruning of Mixture-of-Experts Models
  Wentao Hu, Mingkuan Zhao, Shuangyong Song, Xiaoyan Zhu, Xin Lai, Jiayin Wang (25 Nov 2025)

EfficientXpert: Efficient Domain Adaptation for Large Language Models via Propagation-Aware Pruning
  Songlin Zhao, Michael Pitts, Zhuwei Qin (25 Nov 2025)

Harmonious Parameter Adaptation in Continual Visual Instruction Tuning for Safety-Aligned MLLMs [CLL]
  Z. J. Wang, Chang Che, Qi Wang, Hui Ma, Zenglin Shi, Cees G. M. Snoek, Meng Wang (25 Nov 2025)

ModHiFi: Identifying High Fidelity predictive components for Model Modification
  Dhruva Kashyap, Chaitanya Murti, Pranav K Nayak, Tanay Narshana, Chiranjib Bhattacharyya (24 Nov 2025)

FastForward Pruning: Efficient LLM Pruning via Single-Step Reinforcement Learning [OffRL]
  Xin Yuan, S. Li, Jiateng Wei, Chengrui Zhu, Yanming Wu, Qingpeng Li, Jiajun Lv, Xiaoke Lan, Jun Chen, Yong-Jin Liu (24 Nov 2025)

Think Before You Prune: Selective Self-Generated Calibration for Pruning Large Reasoning Models [LRM]
  Yang Xiang, Yixin Ji, Juntao Li, Min Zhang (24 Nov 2025)

Towards Efficient VLMs: Information-Theoretic Driven Compression via Adaptive Structural Pruning [VLM]
  Zhaoqi Xu, Yingying Zhang, Jian Li, Jianwei Guo, Qiannan Zhu, Hua Huang (24 Nov 2025)

INTERLACE: Interleaved Layer Pruning and Efficient Adaptation in Large Vision-Language Models [VLM]
  Parsa Madinei, Ryan Solgi, Ziqi Wen, Jonathan Skaza, Miguel P. Eckstein, Ramtin Pedarsani (24 Nov 2025)

Exploiting the Experts: Unauthorized Compression in MoE-LLMs [MoE]
  Pinaki Prasad Guha Neogi, Ahmad Mohammadshirazi, Dheeraj Kulshrestha, R. Ramnath (22 Nov 2025)

E^3-Pruner: Towards Efficient, Economical, and Effective Layer Pruning for Large Language Models [VLM]
  Tao Yuan, Haoli Bai, Yinfei Pan, Xuyang Cao, Tianyu Zhang, Lu Hou, Ting Hu, Xianzhi Yu (21 Nov 2025)

Pluggable Pruning with Contiguous Layer Distillation for Diffusion Transformers
  Jian Ma, Qirong Peng, Xujie Zhu, Peixing Xie, Chen Chen, H. Lu (20 Nov 2025)

Breaking Expert Knowledge Limits: Self-Pruning for Large Language Models [LRM]
  Haidong Kang, Lihong Lin, Enneng Yang, Hongning Dai, Hao Wang (19 Nov 2025)

PocketLLM: Ultimate Compression of Large Language Models via Meta Networks [MQ]
  Ye Tian, Chengcheng Wang, Jing Han, Yehui Tang, Kai Han (19 Nov 2025)

OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models [VLM]
  Keda Tao, Kele Shao, Bohan Yu, Weiqiang Wang, Jian Liu, Huan Wang (18 Nov 2025)

MACKO: Sparse Matrix-Vector Multiplication for Low Sparsity
  Vladimír Macko, Vladimír Boža (17 Nov 2025)

Weight-sparse transformers have interpretable circuits [MILM]
  Leo Gao, Achyuta Rajaram, Jacob Coxon, Soham V. Govande, Bowen Baker, Dan Mossing (17 Nov 2025)

TZ-LLM: Protecting On-Device Large Language Models with Arm TrustZone
  Xunjie Wang, Jiacheng Shi, Zihan Zhao, Yang Yu, Zhichao Hua, Jinyu Gu (17 Nov 2025)

Efficient Mathematical Reasoning Models via Dynamic Pruning and Knowledge Distillation [LRM]
  Fengming Yu, Qingyu Meng, Haiwei Pan, Kejia Zhang (15 Nov 2025)

A^3: Attention-Aware Accurate KV Cache Fusion for Fast Large Language Model Serving
  Yuechi Zhou, Yi Su, J. Zhang, Juntao Li, Qingrong Xia, Zhefeng Wang, Xinyu Duan, Baoxing Huai (13 Nov 2025)

Ghost in the Transformer: Detecting Model Reuse with Invariant Spectral Signatures
  Suqing Wang, Ziyang Ma, Xinyi Li, Zuchao Li (09 Nov 2025)

EcoSpa: Efficient Transformer Training with Coupled Sparsity
  Jinqi Xiao, Cheng Luo, Lingyi Huang, Cheng Yang, Yang Sui, ..., Xiao Zang, Yibiao Ying, Zhexiang Tang, A. Anandkumar, Bo Yuan (09 Nov 2025)

APP: Accelerated Path Patching with Task-Specific Pruning
  Frauke Andersen, William Rudman, Ruochen Zhang, Carsten Eickhoff (07 Nov 2025)

TwIST: Rigging the Lottery in Transformers with Independent Subnetwork Training
  Michael Menezes, Barbara Su, Xinze Feng, Yehya Farhat, Hamza Shili, Anastasios Kyrillidis (06 Nov 2025)

IG-Pruning: Input-Guided Block Pruning for Large Language Models [VLM]
  Kangyu Qiao, Shaolei Zhang, Yang Feng (04 Nov 2025)

Optimal Singular Damage: Efficient LLM Inference in Low Storage Regimes
  Mohammadsajad Alipour, Mohammad Mohammadi Amiri (04 Nov 2025)

Continual Learning, Not Training: Online Adaptation For Agents [CLL]
  Aman Jaglan, Jarrod Barnes (02 Nov 2025)

AI Progress Should Be Measured by Capability-Per-Resource, Not Scale Alone: A Framework for Gradient-Guided Resource Allocation in LLMs
  David McCoy, Yulun Wu, Zachary Butzin-Dozier (02 Nov 2025)

1+1>2: A Synergistic Sparse and Low-Rank Compression Method for Large Language Models
  Zeliang Zong, Kai Zhang, Zheyang Li, Wenming Tan, Ye Ren, Yiyan Zhai, Jilin Hu (30 Oct 2025)

NeuronMM: High-Performance Matrix Multiplication for LLM Inference on AWS Trainium
  Dinghong Song, Jierui Xu, Weichu Yang, Pengfei Su, Dong Li (29 Oct 2025)

FlowMM: Cross-Modal Information Flow Guided KV Cache Merging for Efficient Multimodal Context Inference
  Kunxi Li, Yufan Xiong, Zhonghua Jiang, Yiyun Zhou, Zhaode Wang, Chengfei Lv, Shengyu Zhang (29 Oct 2025)

PRO: Enabling Precise and Robust Text Watermark for Open-Source LLMs [WaLM]
  Jiaqi Xue, Yifei Zhao, Mansour Al Ghanim, Shangqian Gao, Ruimin Sun, Qian Lou, Mengxin Zheng (27 Oct 2025)

PAHQ: Accelerating Automated Circuit Discovery through Mixed-Precision Inference Optimization
  Xinhai Wang, Shu Yang, Liangyu Wang, L. Zhang, Huanyi Xie, Lijie Hu, Di Wang (27 Oct 2025)

FastVLM: Self-Speculative Decoding for Fast Vision-Language Model Inference [MLLM, VLM]
  Divya J. Bajpai, M. Hanawal (26 Oct 2025)

TELL-TALE: Task Efficient LLMs with Task Aware Layer Elimination
  Omar Naim, Krish Sharma, Nicholas M. Asher (26 Oct 2025)

Frustratingly Easy Task-aware Pruning for Large Language Models
  Yuanhe Tian, Junjie Liu, Xican Yang, Haishan Ye, Yan Song (26 Oct 2025)

Scaling Up Efficient Small Language Models Serving and Deployment for Semantic Job Search
  Kayhan Behdin, Qingquan Song, Sriram Vasudevan, Jian Sheng, Xiaojing Ma, ..., V. Sodha, Qi Guo, Caleb Johnson, Zhipeng Wang, Fedor Borisyuk (25 Oct 2025)

The Structural Scalpel: Automated Contiguous Layer Pruning for Large Language Models
  Yao Lu, Yuqi Li, Wenbin Xie, Shanqing Yu, Qi Xuan, Zhaowei Zhu, Shiping Wen (25 Oct 2025)

CPSVD: Enhancing Large Language Model Compression via Column-Preserving Singular Value Decomposition
  Lin Xv, Jingsheng Gao, Xian Gao, Ting Li, Yuzhuo Fu (22 Oct 2025)

ARA: Adaptive Rank Allocation for Efficient Large Language Model SVD Compression
  Lin Xv, Jingsheng Gao, Xian Gao, Ting Liu, Yuzhuo Fu (22 Oct 2025)

Restoring Pruned Large Language Models via Lost Component Compensation
  Zijian Feng, Hanzhang Zhou, Zixiao Zhu, Tianjiao Li, Jia Jim Deryl Chua, Lee Onn Mak, Gee Wah Ng, Kezhi Mao (22 Oct 2025)

The Graphon Limit Hypothesis: Understanding Neural Network Pruning via Infinite Width Analysis
  Hoang Pham, T. Ta, Tom Jacobs, R. Burkholz, Long Tran-Thanh (20 Oct 2025)

From Local to Global: Revisiting Structured Pruning Paradigms for Large Language Models
  Ziyan Wang, Enmao Diao, Qi Le, Pu Wang, Minwoo Lee, Shu-ping Yeh, Evgeny Stupachenko, Hao Feng, Li Yang (20 Oct 2025)

Elastic ViTs from Pretrained Models without Retraining [VLM]
  Walter Simoncini, Michael Dorkenwald, Tijmen Blankevoort, Cees G. M. Snoek, Yuki Markus Asano (20 Oct 2025)

Mixed-Precision Quantization for Language Models: Techniques and Prospects [MQ]
  M. Rakka, Marios Fournarakis, Olga Krestinskaya, Jinane Bazzi, K. Salama, Fadi J. Kurdahi, A. Eltawil, M. Fouda (19 Oct 2025)

Synera: Synergistic LLM Serving across Device and Cloud at Scale
  Genglin Wang, Liekang Zeng, Bufang Yang, Kaiwei Liu, Guoliang Xing, Chumin Sun, Li Zhou, Jie Sun, Zhenyu Yan (17 Oct 2025)

A Free Lunch in LLM Compression: Revisiting Retraining after Pruning
  Moritz Wagner, Christophe Roux, Max Zimmer, Sebastian Pokutta (16 Oct 2025)

Mirror Speculative Decoding: Breaking the Serial Barrier in LLM Inference [LRM]
  Nikhil Bhendawade, K. Nishu, Arnav Kundu, Chris Bartels, Minsik Cho, Irina Belousova (15 Oct 2025)

Don't Be Greedy, Just Relax! Pruning LLMs via Frank-Wolfe
  Christophe Roux, Max Zimmer, Alexandre d’Aspremont, Sebastian Pokutta (15 Oct 2025)