SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot

International Conference on Machine Learning (ICML), 2023 · 2 January 2023
Elias Frantar, Dan Alistarh · VLM
ArXiv (abs) · PDF · HTML · HuggingFace (3 upvotes) · GitHub (799★)

Papers citing "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot"

Showing 50 of 665 citing papers.
Mirror Speculative Decoding: Breaking the Serial Barrier in LLM Inference
Nikhil Bhendawade, K. Nishu, Arnav Kundu, Chris Bartels, Minsik Cho, Irina Belousova · LRM · 15 Oct 2025

MosaicDiff: Training-free Structural Pruning for Diffusion Model Acceleration Reflecting Pretraining Dynamics
Bowei Guo, Shengkun Tang, Cong Zeng, Zhiqiang Shen · 13 Oct 2025

MC#: Mixture Compressor for Mixture-of-Experts Large Models
Wei Huang, Yue Liao, Yukang Chen, Jianhui Liu, Haoru Tan, Si Liu, Shiming Zhang, Shuicheng Yan, Xiaojuan Qi · MoE, MQ · 13 Oct 2025

ShishuLM: Lightweight Language Model with Hybrid Decoder-MLP Architecture and Paired Weight Sharing
Shivanshu Kumar, Gopalakrishnan Srinivasan · 13 Oct 2025

AnyBCQ: Hardware Efficient Flexible Binary-Coded Quantization for Multi-Precision LLMs
Gunho Park, Jeongin Bae, Beomseok Kwon, Byeongwook Kim, S. Kwon, Dongsoo Lee · MQ · 12 Oct 2025

Preserving LLM Capabilities through Calibration Data Curation: From Analysis to Optimization
Bowei He, Lihao Yin, Huiling Zhen, Shuqi Liu, Han Wu, Xiaokun Zhang, Mingxuan Yuan, Chen Ma · 12 Oct 2025

PermLLM: Learnable Channel Permutation for N:M Sparse Large Language Models
Lancheng Zou, Shuo Yin, Zehua Pei, Tsung-Yi Ho, Farzan Farnia, Bei Yu · 11 Oct 2025

RCPU: Rotation-Constrained Error Compensation for Structured Pruning of a Large Language Model
Shuichiro Haruta, Kazunori Matsumoto, Zhi Li, Yanan Wang, Mori Kurokawa · 09 Oct 2025

SliceFine: The Universal Winning-Slice Hypothesis for Pretrained Networks
Md. Kowsher, Ali O. Polat, Ehsan Mohammady Ardehaly, Mehrdad Salehi, Zia Ghiasi, Prasanth Murali, Chen Chen · 09 Oct 2025

AILoRA: Function-Aware Asymmetric Initialization for Low-Rank Adaptation of Large Language Models
Xiaoshuang Ji, Zhendong Zhao, Xiaoyan Gu, Xiaojun Chen, Xin Zhao, Zeyao Liu · 09 Oct 2025

Don't Run with Scissors: Pruning Breaks VLA Models but They Can Be Recovered
Jason J. Jabbour, Dong-Ki Kim, Max Smith, Jay Patrikar, Radhika Ghosal, Youhui Wang, Ali Agha, Vijay Janapa Reddi, Shayegan Omidshafiei · VLM · 09 Oct 2025

Fewer Weights, More Problems: A Practical Attack on LLM Pruning
Kazuki Egashira, Robin Staab, Thibaud Gloaguen, Mark Vero, Martin Vechev · AAML · 09 Oct 2025

Vanishing Contributions: A Unified Approach to Smoothly Transition Neural Models into Compressed Form
Lorenzo Nikiforos, Charalampos Antoniadis, Luciano Prono, F. Pareschi, R. Rovatti, Gianluca Setti · 09 Oct 2025

Where to Begin: Efficient Pretraining via Subnetwork Selection and Distillation
Arjun Krishnakumar, R. Sukthanker, Hannan Javed Mahadik, Gabriela Kadlecová, Vladyslav Moroshan, Timur Carstensen, Frank Hutter, Aaron Klein · 08 Oct 2025

OBS-Diff: Accurate Pruning For Diffusion Models in One-Shot
Junhan Zhu, Hesong Wang, Mingluo Su, Zefang Wang, Huan Wang · DiffM, VLM · 08 Oct 2025

Activation-Informed Pareto-Guided Low-Rank Compression for Efficient LLM/VLM
Ryan Solgi, Parsa Madinei, Jiayi Tian, Rupak Vignesh Swaminathan, Jing Liu, Nathan Susanj, Zheng Zhang · 07 Oct 2025

Diversity Is All You Need for Contrastive Learning: Spectral Bounds on Gradient Magnitudes
Peter Ochieng · 07 Oct 2025

ARMOR: High-Performance Semi-Structured Pruning via Adaptive Matrix Factorization
Lawrence Liu, Alexander Liu, Mengdi Wang, T. Zhao, Lin F. Yang · 07 Oct 2025

lm-Meter: Unveiling Runtime Inference Latency for On-Device Language Models
Haoxin Wang, Xiaolong Tu, Hongyu Ke, Huirong Chai, Dawei Chen, Kyungtae Han · 07 Oct 2025

Expand Neurons, Not Parameters
Linghao Kong, Inimai Subramanian, Yonadav Shavit, Micah Adler, Dan Alistarh, Nir Shavit · 06 Oct 2025

The Curious Case of In-Training Compression of State Space Models
Makram Chahine, Philipp Nazari, Daniela Rus, T. Konstantin Rusch · 03 Oct 2025

Small is Sufficient: Reducing the World AI Energy Consumption Through Model Selection
Tiago da Silva Barros, Frédéric Giroire, Ramon Aparicio-Pardo, Joanna Moulierac · 02 Oct 2025

The Unseen Frontier: Pushing the Limits of LLM Sparsity with Surrogate-Free ADMM
Kwanhee Lee, Hyeondo Jang, Dongyeop Lee, Dan Alistarh, Namhoon Lee · 02 Oct 2025

Accelerating Attention with Basis Decomposition
Jialin Zhao · 02 Oct 2025

PrunedLoRA: Robust Gradient-Based Structured Pruning for Low-Rank Adaptation in Fine-Tuning
Xin Yu, Cong Xie, Ziyu Zhao, Tiantian Fan, Lingzhou Xue, Zhi-Li Zhang · 30 Sep 2025

CAST: Continuous and Differentiable Semi-Structured Sparsity-Aware Training for Large Language Models
Weiyu Huang, Yuezhou Hu, Jun Zhu, Jianfei Chen · CLL · 30 Sep 2025

Collaborative Compression for Large-Scale MoE Deployment on Edge
Yixiao Chen, Yanyue Xie, Ruining Yang, Wei Jiang, Wei Wang, Yong He, Yue Chen, Pu Zhao, Y. Wang · MQ · 30 Sep 2025

Layer-wise Dynamic Rank for Compressing Large Language Models
Zhendong Mi, Bian Sun, Grace Li Zhang, Shaoyi Huang · ALM · 30 Sep 2025

UniPruning: Unifying Local Metric and Global Feedback for Scalable Sparse LLMs
Yizhuo Ding, Wanying Qu, Jiawei Geng, Wenqi Shao, Yanwei Fu · 29 Sep 2025

DiffuSpec: Unlocking Diffusion Language Models for Speculative Decoding
Guanghao Li, Zhihui Fu, Min Fang, Qibin Zhao, Ming Tang, Chun Yuan, Jun Wang · 28 Sep 2025

A Second-Order Perspective on Pruning at Initialization and Knowledge Transfer
Leonardo Iurada, Beatrice Occhiena, Tatiana Tommasi · VLM · 28 Sep 2025

Differentiable Sparsity via $D$-Gating: Simple and Versatile Structured Penalization
Chris Kolb, Laetitia Frost, J. Herbinger, David Rügamer · 28 Sep 2025

Quant-dLLM: Post-Training Extreme Low-Bit Quantization for Diffusion Large Language Models
Tianao Zhang, Zhiteng Li, Xianglong Yan, Haotong Qin, Yong Guo, Yulun Zhang · MQ · 27 Sep 2025

COSPADI: Compressing LLMs via Calibration-Guided Sparse Dictionary Learning
Dmitriy Shopkhoev, Denis Makhov, Magauiya Zhussip, Ammar Ali, Stamatios Lefkimmiatis · 26 Sep 2025

Lightweight Error Mitigation Strategies for Post-Training N:M Activation Sparsity in LLMs
Shirin Alanova, Kristina Kazistova, Ekaterina Galaeva, Alina Kostromina, Vladimir Smirnov, Redko Dmitry, Alexey Dontsov, Maxim Zhelnin, Evgeny Burnaev, Egor Shvetsov · 26 Sep 2025

StructPrune: Structured Global Pruning Asymptotics with $\mathcal{O}(\sqrt{N})$ GPU Memory
Xinyuan Song, Guangji Bai, Bo Pan · 25 Sep 2025

RSAVQ: Riemannian Sensitivity-Aware Vector Quantization for Large Language Models
Zukang Xu, Yan Chen, Qiang Wu, Dawei Yang · MQ · 24 Sep 2025

NIRVANA: Structured Pruning Reimagined for Large Language Models Compression
Mengting Ai, Tianxin Wei, Sirui Chen, Jingrui He · VLM · 17 Sep 2025

FastMTP: Accelerating LLM Inference with Enhanced Multi-Token Prediction
Yuxuan Cai, Xiaozhuan Liang, X. Wang, Jin Ma, Haijin Liang, Jinwen Luo, Xinyu Zuo, Lisheng Duan, Yuyang Yin, Xi Chen · 16 Sep 2025

Reasoning Models Can Be Accurately Pruned via Chain-of-Thought Reconstruction
Ryan Lucas, Kayhan Behdin, Zhipeng Wang, Qingquan Song, Shao Tang, Rahul Mazumder · ReLM, LRM, AI4CE · 15 Sep 2025

Harnessing Optimization Dynamics for Curvature-Informed Model Merging
Pouria Mahdavinia, Hamed Mahdavi, Niloofar Mireshghallah, M. Mahdavi · MoMe · 14 Sep 2025

Optimal Brain Restoration for Joint Quantization and Sparsification of LLMs
Hang Guo, Yawei Li, Luca Benini · MQ · 14 Sep 2025

GAPrune: Gradient-Alignment Pruning for Domain-Aware Embeddings
Yixuan Tang, Yi Yang · 13 Sep 2025

Unified Start, Personalized End: Progressive Pruning for Efficient 3D Medical Image Segmentation
Linhao Li, Yiwen Ye, Ziyang Chen, Yong-quan Xia · MedIm · 11 Sep 2025

COMPACT: Common-token Optimized Model Pruning Across Channels and Tokens
Eugene Kwek, Wenpeng Yin · VLM · 08 Sep 2025

Delta Activations: A Representation for Finetuned Large Language Models
Zhiqiu Xu, Amish Sethi, Mayur Naik, Ser-Nam Lim · 04 Sep 2025

From Injection to Defense: Constructing Edit-Based Fingerprints for Large Language Models
Yue Li, Xin Yi, Dongsheng Shi, Yongyi Cui, Gerard de Melo, Xiaoling Wang · KELM, AAML · 03 Sep 2025

LExI: Layer-Adaptive Active Experts for Efficient MoE Model Inference
Krishna Teja Chitty-Venkata, Sandeep Madireddy, M. Emani, V. Vishwanath · MoE · 02 Sep 2025

Not All Parameters Are Created Equal: Smart Isolation Boosts Fine-Tuning Performance
Yao Wang, Di Liang, Minlong Peng · MoMe · 29 Aug 2025

Towards On-Device Personalization: Cloud-Device Collaborative Data Augmentation for Efficient On-Device Language Model
Zhaofeng Zhong, Wei Yuan, Liang Qu, Tong Chen, Hao Wang, Xiangyu Zhao, Hongzhi Yin · 29 Aug 2025