SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot

International Conference on Machine Learning (ICML), 2023 · 2 January 2023
Elias Frantar, Dan Alistarh · VLM
ArXiv (abs) · PDF · HTML · HuggingFace (3 upvotes) · GitHub (799★)

Papers citing "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot"

Showing 50 of 665 citing papers.
Mirror Speculative Decoding: Breaking the Serial Barrier in LLM Inference
Nikhil Bhendawade, K. Nishu, Arnav Kundu, Chris Bartels, Minsik Cho, Irina Belousova · LRM · 15 Oct 2025

MosaicDiff: Training-free Structural Pruning for Diffusion Model Acceleration Reflecting Pretraining Dynamics
Bowei Guo, Shengkun Tang, Cong Zeng, Zhiqiang Shen · 13 Oct 2025

MC#: Mixture Compressor for Mixture-of-Experts Large Models
Wei Huang, Yue Liao, Yukang Chen, Jianhui Liu, Haoru Tan, Si Liu, Shiming Zhang, Shuicheng Yan, Xiaojuan Qi · MoE, MQ · 13 Oct 2025

ShishuLM: Lightweight Language Model with Hybrid Decoder-MLP Architecture and Paired Weight Sharing
Shivanshu Kumar, Gopalakrishnan Srinivasan · 13 Oct 2025

AnyBCQ: Hardware Efficient Flexible Binary-Coded Quantization for Multi-Precision LLMs
Gunho Park, Jeongin Bae, Beomseok Kwon, Byeongwook Kim, S. Kwon, Dongsoo Lee · MQ · 12 Oct 2025

Preserving LLM Capabilities through Calibration Data Curation: From Analysis to Optimization
Bowei He, Lihao Yin, Huiling Zhen, Shuqi Liu, Han Wu, Xiaokun Zhang, Mingxuan Yuan, Chen Ma · 12 Oct 2025

PermLLM: Learnable Channel Permutation for N:M Sparse Large Language Models
Lancheng Zou, Shuo Yin, Zehua Pei, Tsung-Yi Ho, Farzan Farnia, Bei Yu · 11 Oct 2025

RCPU: Rotation-Constrained Error Compensation for Structured Pruning of a Large Language Model
Shuichiro Haruta, Kazunori Matsumoto, Zhi Li, Yanan Wang, Mori Kurokawa · 09 Oct 2025

SliceFine: The Universal Winning-Slice Hypothesis for Pretrained Networks
Md. Kowsher, Ali O. Polat, Ehsan Mohammady Ardehaly, Mehrdad Salehi, Zia Ghiasi, Prasanth Murali, Chen Chen · 09 Oct 2025

AILoRA: Function-Aware Asymmetric Initialization for Low-Rank Adaptation of Large Language Models
Xiaoshuang Ji, Zhendong Zhao, Xiaoyan Gu, Xiaojun Chen, Xin Zhao, Zeyao Liu · 09 Oct 2025

Don't Run with Scissors: Pruning Breaks VLA Models but They Can Be Recovered
Jason J. Jabbour, Dong-Ki Kim, Max Smith, Jay Patrikar, Radhika Ghosal, Youhui Wang, Ali Agha, Vijay Janapa Reddi, Shayegan Omidshafiei · VLM · 09 Oct 2025

Fewer Weights, More Problems: A Practical Attack on LLM Pruning
Kazuki Egashira, Robin Staab, Thibaud Gloaguen, Mark Vero, Martin Vechev · AAML · 09 Oct 2025

Vanishing Contributions: A Unified Approach to Smoothly Transition Neural Models into Compressed Form
Lorenzo Nikiforos, Charalampos Antoniadis, Luciano Prono, F. Pareschi, R. Rovatti, Gianluca Setti · 09 Oct 2025

Where to Begin: Efficient Pretraining via Subnetwork Selection and Distillation
Arjun Krishnakumar, R. Sukthanker, Hannan Javed Mahadik, Gabriela Kadlecová, Vladyslav Moroshan, Timur Carstensen, Frank Hutter, Aaron Klein · 08 Oct 2025

OBS-Diff: Accurate Pruning For Diffusion Models in One-Shot
Junhan Zhu, Hesong Wang, Mingluo Su, Zefang Wang, Huan Wang · DiffM, VLM · 08 Oct 2025

Activation-Informed Pareto-Guided Low-Rank Compression for Efficient LLM/VLM
Ryan Solgi, Parsa Madinei, Jiayi Tian, Rupak Vignesh Swaminathan, Jing Liu, Nathan Susanj, Zheng Zhang · 07 Oct 2025

Diversity Is All You Need for Contrastive Learning: Spectral Bounds on Gradient Magnitudes
Peter Ochieng · 07 Oct 2025

ARMOR: High-Performance Semi-Structured Pruning via Adaptive Matrix Factorization
Lawrence Liu, Alexander Liu, Mengdi Wang, T. Zhao, Lin F. Yang · 07 Oct 2025

lm-Meter: Unveiling Runtime Inference Latency for On-Device Language Models
Haoxin Wang, Xiaolong Tu, Hongyu Ke, Huirong Chai, Dawei Chen, Kyungtae Han · 07 Oct 2025

Expand Neurons, Not Parameters
Linghao Kong, Inimai Subramanian, Yonadav Shavit, Micah Adler, Dan Alistarh, Nir Shavit · 06 Oct 2025

The Curious Case of In-Training Compression of State Space Models
Makram Chahine, Philipp Nazari, Daniela Rus, T. Konstantin Rusch · 03 Oct 2025

Small is Sufficient: Reducing the World AI Energy Consumption Through Model Selection
Tiago da Silva Barros, Frédéric Giroire, Ramon Aparicio-Pardo, Joanna Moulierac · 02 Oct 2025

The Unseen Frontier: Pushing the Limits of LLM Sparsity with Surrogate-Free ADMM
Kwanhee Lee, Hyeondo Jang, Dongyeop Lee, Dan Alistarh, Namhoon Lee · 02 Oct 2025

Accelerating Attention with Basis Decomposition
Jialin Zhao · 02 Oct 2025

PrunedLoRA: Robust Gradient-Based Structured Pruning for Low-Rank Adaptation in Fine-Tuning
Xin Yu, Cong Xie, Ziyu Zhao, Tiantian Fan, Lingzhou Xue, Zhi-Li Zhang · 30 Sep 2025

CAST: Continuous and Differentiable Semi-Structured Sparsity-Aware Training for Large Language Models
Weiyu Huang, Yuezhou Hu, Jun Zhu, Jianfei Chen · CLL · 30 Sep 2025

Collaborative Compression for Large-Scale MoE Deployment on Edge
Yixiao Chen, Yanyue Xie, Ruining Yang, Wei Jiang, Wei Wang, Yong He, Yue Chen, Pu Zhao, Y. Wang · MQ · 30 Sep 2025

Layer-wise Dynamic Rank for Compressing Large Language Models
Zhendong Mi, Bian Sun, Grace Li Zhang, Shaoyi Huang · ALM · 30 Sep 2025

UniPruning: Unifying Local Metric and Global Feedback for Scalable Sparse LLMs
Yizhuo Ding, Wanying Qu, Jiawei Geng, Wenqi Shao, Yanwei Fu · 29 Sep 2025

DiffuSpec: Unlocking Diffusion Language Models for Speculative Decoding
Guanghao Li, Zhihui Fu, Min Fang, Qibin Zhao, Ming Tang, Chun Yuan, Jun Wang · 28 Sep 2025

A Second-Order Perspective on Pruning at Initialization and Knowledge Transfer
Leonardo Iurada, Beatrice Occhiena, Tatiana Tommasi · VLM · 28 Sep 2025

Differentiable Sparsity via $D$-Gating: Simple and Versatile Structured Penalization
Chris Kolb, Laetitia Frost, J. Herbinger, David Rügamer · 28 Sep 2025

Quant-dLLM: Post-Training Extreme Low-Bit Quantization for Diffusion Large Language Models
Tianao Zhang, Zhiteng Li, Xianglong Yan, Haotong Qin, Yong Guo, Yulun Zhang · MQ · 27 Sep 2025

COSPADI: Compressing LLMs via Calibration-Guided Sparse Dictionary Learning
Dmitriy Shopkhoev, Denis Makhov, Magauiya Zhussip, Ammar Ali, Stamatios Lefkimmiatis · 26 Sep 2025

Lightweight Error Mitigation Strategies for Post-Training N:M Activation Sparsity in LLMs
Shirin Alanova, Kristina Kazistova, Ekaterina Galaeva, Alina Kostromina, Vladimir Smirnov, Redko Dmitry, Alexey Dontsov, Maxim Zhelnin, Evgeny Burnaev, Egor Shvetsov · 26 Sep 2025

StructPrune: Structured Global Pruning Asymptotics with $\mathcal{O}(\sqrt{N})$ GPU Memory
Xinyuan Song, Guangji Bai, Bo Pan · 25 Sep 2025

RSAVQ: Riemannian Sensitivity-Aware Vector Quantization for Large Language Models
Zukang Xu, Yan Chen, Qiang Wu, Dawei Yang · MQ · 24 Sep 2025

NIRVANA: Structured Pruning Reimagined for Large Language Models Compression
Mengting Ai, Tianxin Wei, Sirui Chen, Jingrui He · VLM · 17 Sep 2025

FastMTP: Accelerating LLM Inference with Enhanced Multi-Token Prediction
Yuxuan Cai, Xiaozhuan Liang, X. Wang, Jin Ma, Haijin Liang, Jinwen Luo, Xinyu Zuo, Lisheng Duan, Yuyang Yin, Xi Chen · 16 Sep 2025

Reasoning Models Can Be Accurately Pruned via Chain-of-Thought Reconstruction
Ryan Lucas, Kayhan Behdin, Zhipeng Wang, Qingquan Song, Shao Tang, Rahul Mazumder · ReLM, LRM, AI4CE · 15 Sep 2025

Harnessing Optimization Dynamics for Curvature-Informed Model Merging
Pouria Mahdavinia, Hamed Mahdavi, Niloofar Mireshghallah, M. Mahdavi · MoMe · 14 Sep 2025

Optimal Brain Restoration for Joint Quantization and Sparsification of LLMs
Hang Guo, Yawei Li, Luca Benini · MQ · 14 Sep 2025

GAPrune: Gradient-Alignment Pruning for Domain-Aware Embeddings
Yixuan Tang, Yi Yang · 13 Sep 2025

Unified Start, Personalized End: Progressive Pruning for Efficient 3D Medical Image Segmentation
Linhao Li, Yiwen Ye, Ziyang Chen, Yong-quan Xia · MedIm · 11 Sep 2025

COMPACT: Common-token Optimized Model Pruning Across Channels and Tokens
Eugene Kwek, Wenpeng Yin · VLM · 08 Sep 2025

Delta Activations: A Representation for Finetuned Large Language Models
Zhiqiu Xu, Amish Sethi, Mayur Naik, Ser-Nam Lim · 04 Sep 2025

From Injection to Defense: Constructing Edit-Based Fingerprints for Large Language Models
Yue Li, Xin Yi, Dongsheng Shi, Yongyi Cui, Gerard de Melo, Xiaoling Wang · KELM, AAML · 03 Sep 2025

LExI: Layer-Adaptive Active Experts for Efficient MoE Model Inference
Krishna Teja Chitty-Venkata, Sandeep Madireddy, M. Emani, V. Vishwanath · MoE · 02 Sep 2025

Not All Parameters Are Created Equal: Smart Isolation Boosts Fine-Tuning Performance
Yao Wang, Di Liang, Minlong Peng · MoMe · 29 Aug 2025

Towards On-Device Personalization: Cloud-Device Collaborative Data Augmentation for Efficient On-Device Language Model
Zhaofeng Zhong, Wei Yuan, Liang Qu, Tong Chen, Hao Wang, Xiangyu Zhao, Hongzhi Yin · 29 Aug 2025