Papers › 2301.00774 › Cited By
SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot
International Conference on Machine Learning (ICML), 2023
2 January 2023
Elias Frantar
Dan Alistarh
Tags: VLM
Links: arXiv (abs) · PDF · HTML · HuggingFace (3 upvotes) · GitHub (799★)

Papers citing "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot"

50 / 665 papers shown
PATCH: Learnable Tile-level Hybrid Sparsity for LLMs
  Younes Hourri, Mohammad Mozaffari, M. Dehnavi (24 Dec 2025)
Mitigating Catastrophic Forgetting in Target Language Adaptation of LLMs via Source-Shielded Updates
  Atsuki Yamaguchi, Terufumi Morishita, Aline Villavicencio, Nikolaos Aletras (04 Dec 2025) [CLL]
Think Before You Prune: Self-Reflective Structured Pruning for Reasoning Language Models
  Ziyan Wang, Enmao Diao, Qi Le, Pu Wang, G. Wang, Minwoo Lee, Shu-ping Yeh, Li Yang (01 Dec 2025) [ReLM, LRM, VLM]
Harmonious Parameter Adaptation in Continual Visual Instruction Tuning for Safety-Aligned MLLMs
  Z. J. Wang, Chang Che, Qi Wang, Hui Ma, Zenglin Shi, Cees G. M. Snoek, Meng Wang (25 Nov 2025) [CLL]
Mosaic Pruning: A Hierarchical Framework for Generalizable Pruning of Mixture-of-Experts Models
  Wentao Hu, Mingkuan Zhao, Shuangyong Song, Xiaoyan Zhu, Xin Lai, Jiayin Wang (25 Nov 2025)
EfficientXpert: Efficient Domain Adaptation for Large Language Models via Propagation-Aware Pruning
  Songlin Zhao, Michael Pitts, Zhuwei Qin (25 Nov 2025)
INTERLACE: Interleaved Layer Pruning and Efficient Adaptation in Large Vision-Language Models
  Parsa Madinei, Ryan Solgi, Ziqi Wen, Jonathan Skaza, Miguel P. Eckstein, Ramtin Pedarsani (24 Nov 2025) [VLM]
FastForward Pruning: Efficient LLM Pruning via Single-Step Reinforcement Learning
  Xin Yuan, S. Li, Jiateng Wei, Chengrui Zhu, Yanming Wu, Qingpeng Li, Jiajun Lv, Xiaoke Lan, Jun Chen, Yong-Jin Liu (24 Nov 2025) [OffRL]
ModHiFi: Identifying High Fidelity predictive components for Model Modification
  Dhruva Kashyap, Chaitanya Murti, Pranav K Nayak, Tanay Narshana, Chiranjib Bhattacharyya (24 Nov 2025)
Towards Efficient VLMs: Information-Theoretic Driven Compression via Adaptive Structural Pruning
  Zhaoqi Xu, Yingying Zhang, Jian Li, Jianwei Guo, Qiannan Zhu, Hua Huang (24 Nov 2025) [VLM]
Think Before You Prune: Selective Self-Generated Calibration for Pruning Large Reasoning Models
  Yang Xiang, Yixin Ji, Juntao Li, Min Zhang (24 Nov 2025) [LRM]
Exploiting the Experts: Unauthorized Compression in MoE-LLMs
  Pinaki Prasad Guha Neogi, Ahmad Mohammadshirazi, Dheeraj Kulshrestha, R. Ramnath (22 Nov 2025) [MoE]
E$^3$-Pruner: Towards Efficient, Economical, and Effective Layer Pruning for Large Language Models
  Tao Yuan, Haoli Bai, Yinfei Pan, Xuyang Cao, Tianyu Zhang, Lu Hou, Ting Hu, Xianzhi Yu (21 Nov 2025) [VLM]
Pluggable Pruning with Contiguous Layer Distillation for Diffusion Transformers
  Jian Ma, Qirong Peng, Xujie Zhu, Peixing Xie, Chen Chen, H. Lu (20 Nov 2025)
PocketLLM: Ultimate Compression of Large Language Models via Meta Networks
  Ye Tian, Chengcheng Wang, Jing Han, Yehui Tang, Kai Han (19 Nov 2025) [MQ]
Breaking Expert Knowledge Limits: Self-Pruning for Large Language Models
  Haidong Kang, Lihong Lin, Enneng Yang, Hongning Dai, Hao Wang (19 Nov 2025) [LRM]
OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models
  Keda Tao, Kele Shao, Bohan Yu, Weiqiang Wang, Jian Liu, Huan Wang (18 Nov 2025) [VLM]
MACKO: Sparse Matrix-Vector Multiplication for Low Sparsity
  Vladimír Macko, Vladimír Boža (17 Nov 2025)
Weight-sparse transformers have interpretable circuits
  Leo Gao, Achyuta Rajaram, Jacob Coxon, Soham V. Govande, Bowen Baker, Dan Mossing (17 Nov 2025) [MILM]
TZ-LLM: Protecting On-Device Large Language Models with Arm TrustZone
  Xunjie Wang, Jiacheng Shi, Zihan Zhao, Yang Yu, Zhichao Hua, Jinyu Gu (17 Nov 2025)
Efficient Mathematical Reasoning Models via Dynamic Pruning and Knowledge Distillation
  Fengming Yu, Qingyu Meng, Haiwei Pan, Kejia Zhang (15 Nov 2025) [LRM]
$A^3$: Attention-Aware Accurate KV Cache Fusion for Fast Large Language Model Serving
  Yuechi Zhou, Yi Su, J. Zhang, Juntao Li, Qingrong Xia, Zhefeng Wang, Xinyu Duan, Baoxing Huai (13 Nov 2025)
EcoSpa: Efficient Transformer Training with Coupled Sparsity
  Jinqi Xiao, Cheng Luo, Lingyi Huang, Cheng Yang, Yang Sui, ..., Xiao Zang, Yibiao Ying, Zhexiang Tang, A. Anandkumar, Bo Yuan (09 Nov 2025)
Ghost in the Transformer: Detecting Model Reuse with Invariant Spectral Signatures
  Suqing Wang, Ziyang Ma, Xinyi Li, Zuchao Li (09 Nov 2025)
APP: Accelerated Path Patching with Task-Specific Pruning
  Frauke Andersen, William Rudman, Ruochen Zhang, Carsten Eickhoff (07 Nov 2025)
TwIST: Rigging the Lottery in Transformers with Independent Subnetwork Training
  Michael Menezes, Barbara Su, Xinze Feng, Yehya Farhat, Hamza Shili, Anastasios Kyrillidis (06 Nov 2025)
IG-Pruning: Input-Guided Block Pruning for Large Language Models
  Kangyu Qiao, Shaolei Zhang, Yang Feng (04 Nov 2025) [VLM]
Optimal Singular Damage: Efficient LLM Inference in Low Storage Regimes
  Mohammadsajad Alipour, Mohammad Mohammadi Amiri (04 Nov 2025)
Continual Learning, Not Training: Online Adaptation For Agents
  Aman Jaglan, Jarrod Barnes (02 Nov 2025) [CLL]
AI Progress Should Be Measured by Capability-Per-Resource, Not Scale Alone: A Framework for Gradient-Guided Resource Allocation in LLMs
  David McCoy, Yulun Wu, Zachary Butzin-Dozier (02 Nov 2025)
1+1>2: A Synergistic Sparse and Low-Rank Compression Method for Large Language Models
  Zeliang Zong, Kai Zhang, Zheyang Li, Wenming Tan, Ye Ren, Yiyan Zhai, Jilin Hu (30 Oct 2025)
NeuronMM: High-Performance Matrix Multiplication for LLM Inference on AWS Trainium
  Dinghong Song, Jierui Xu, Weichu Yang, Pengfei Su, Dong Li (29 Oct 2025)
FlowMM: Cross-Modal Information Flow Guided KV Cache Merging for Efficient Multimodal Context Inference
  Kunxi Li, Yufan Xiong, Zhonghua Jiang, Yiyun Zhou, Zhaode Wang, Chengfei Lv, Shengyu Zhang (29 Oct 2025)
PRO: Enabling Precise and Robust Text Watermark for Open-Source LLMs
  Jiaqi Xue, Yifei Zhao, Mansour Al Ghanim, Shangqian Gao, Ruimin Sun, Qian Lou, Mengxin Zheng (27 Oct 2025) [WaLM]
PAHQ: Accelerating Automated Circuit Discovery through Mixed-Precision Inference Optimization
  Xinhai Wang, Shu Yang, Liangyu Wang, L. Zhang, Huanyi Xie, Lijie Hu, Di Wang (27 Oct 2025)
Frustratingly Easy Task-aware Pruning for Large Language Models
  Yuanhe Tian, Junjie Liu, Xican Yang, Haishan Ye, Yan Song (26 Oct 2025)
FastVLM: Self-Speculative Decoding for Fast Vision-Language Model Inference
  Divya J. Bajpai, M. Hanawal (26 Oct 2025) [MLLM, VLM]
TELL-TALE: Task Efficient LLMs with Task Aware Layer Elimination
  Omar Naim, Krish Sharma, Nicholas M. Asher, Nicholas Asher (26 Oct 2025)
Scaling Up Efficient Small Language Models Serving and Deployment for Semantic Job Search
  Kayhan Behdin, Qingquan Song, Sriram Vasudevan, Jian Sheng, Xiaojing Ma, ..., V. Sodha, Qi Guo, Caleb Johnson, Zhipeng Wang, Fedor Borisyuk (25 Oct 2025)
The Structural Scalpel: Automated Contiguous Layer Pruning for Large Language Models
  Yao Lu, Yuqi Li, Wenbin Xie, Shanqing Yu, Qi Xuan, Zhaowei Zhu, Shiping Wen (25 Oct 2025)
Beyond Uniform SVD: Dual-Level Optimization across Columns and Modules for LLM Compression
  Lin Xv, Jingsheng Gao, Xian Gao, Ting Li (22 Oct 2025)
ARA: Adaptive Rank Allocation for Efficient Large Language Model SVD Compression
  Lin Xv, Jingsheng Gao, Xian Gao, Ting Liu, Yuzhuo Fu (22 Oct 2025)
Restoring Pruned Large Language Models via Lost Component Compensation
  Zijian Feng, Hanzhang Zhou, Zixiao Zhu, Tianjiao Li, Jia Jim Deryl Chua, Lee Onn Mak, Gee Wah Ng, Kezhi Mao (22 Oct 2025)
Elastic ViTs from Pretrained Models without Retraining
  Walter Simoncini, Michael Dorkenwald, Tijmen Blankevoort, Cees G. M. Snoek, Yuki Markus Asano (20 Oct 2025) [VLM]
The Graphon Limit Hypothesis: Understanding Neural Network Pruning via Infinite Width Analysis
  Hoang Pham, T. Ta, Tom Jacobs, R. Burkholz, Long Tran-Thanh (20 Oct 2025)
From Local to Global: Revisiting Structured Pruning Paradigms for Large Language Models
  Ziyan Wang, Enmao Diao, Qi Le, Pu Wang, Minwoo Lee, Shu-ping Yeh, Evgeny Stupachenko, Hao Feng, Li Yang (20 Oct 2025)
Mixed-Precision Quantization for Language Models: Techniques and Prospects
  M. Rakka, Marios Fournarakis, Olga Krestinskaya, Jinane Bazzi, K. Salama, Fadi J. Kurdahi, A. Eltawil, M. Fouda (19 Oct 2025) [MQ]
Synera: Synergistic LLM Serving across Device and Cloud at Scale
  Genglin Wang, Liekang Zeng, Bufang Yang, Kaiwei Liu, Guoliang Xing, Chumin Sun, Li Zhou, Jie Sun, Zhenyu Yan (17 Oct 2025)
A Free Lunch in LLM Compression: Revisiting Retraining after Pruning
  Moritz Wagner, Christophe Roux, Max Zimmer, Sebastian Pokutta (16 Oct 2025)
Mirror Speculative Decoding: Breaking the Serial Barrier in LLM Inference
  Nikhil Bhendawade, K. Nishu, Arnav Kundu, Chris Bartels, Minsik Cho, Irina Belousova (15 Oct 2025) [LRM]
Page 1 of 14