SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot

International Conference on Machine Learning (ICML), 2023
2 January 2023
Elias Frantar
Dan Alistarh
    VLM
ArXiv (abs) · PDF · HTML · HuggingFace (3 upvotes) · GitHub (799★)
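For context on the paper the list below cites: "one-shot" pruning here means each weight matrix is sparsified in a single post-training pass, with no retraining. The sketch below is a minimal illustration of that setting using plain magnitude pruning; it is not the SparseGPT algorithm itself, which additionally uses a small calibration set and approximate second-order statistics to select and compensate pruned weights. The function name, the hypothetical `MyTransformer` model, and the 50% sparsity target are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch: one-shot *magnitude* pruning of every nn.Linear in a model.
# Simplified baseline for illustration only; SparseGPT instead solves a
# layer-wise reconstruction problem using calibration data.
import torch
import torch.nn as nn

@torch.no_grad()
def one_shot_magnitude_prune(model: nn.Module, sparsity: float = 0.5) -> None:
    """Zero out the smallest-magnitude weights of each Linear layer in place."""
    for module in model.modules():
        if isinstance(module, nn.Linear):
            w = module.weight.data
            k = int(w.numel() * sparsity)            # number of weights to drop
            if k == 0:
                continue
            threshold = w.abs().flatten().kthvalue(k).values
            mask = w.abs() > threshold               # keep weights above the cutoff
            module.weight.data = w * mask

# Usage (hypothetical model):
# model = MyTransformer(...)
# one_shot_magnitude_prune(model, sparsity=0.5)     # 50% unstructured sparsity
```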

Papers citing "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot"

50 / 665 papers shown
DualSparse-MoE: Coordinating Tensor/Neuron-Level Sparsity with Expert Partition and Reconstruction
Weilin Cai, Le Qin, Shwai He, Junwei Cui, Ang Li, Jiayi Huang
MoE
25 Aug 2025
Unraveling the cognitive patterns of Large Language Models through module communities
Kushal Raj Bhandari, Pin-Yu Chen, Jianxi Gao
25 Aug 2025
Less Is More? Examining Fairness in Pruned Large Language Models for Summarising Opinions
Nannan Huang, Haytham M. Fayek, Xiuzhen Zhang
25 Aug 2025
One VLM, Two Roles: Stage-Wise Routing and Specialty-Level Deployment for Clinical Workflows
Shayan Vassef, Soorya Ram Shimegekar, Abhay Goyal, Koustuv Saha, Pi Zonooz, Navin Kumar
22 Aug 2025
GM-Skip: Metric-Guided Transformer Block Skipping for Efficient Vision-Language Models
Lianming Huang, Haibo Hu, Qiao Li, Xin He, Nan Guan, Chun Jason Xue
VLM
20 Aug 2025
Z-Pruner: Post-Training Pruning of Large Language Models for Efficiency without Retraining
Samiul Basir Bhuiyan, Md. Sazzad Hossain Adib, Mohammed Aman Bhuiyan, Muhammad Rafsan Kabir, Moshiur Farazi, Shafin Rahman, Nabeel Mohammed
18 Aug 2025
SparseMap: A Sparse Tensor Accelerator Framework Based on Evolution Strategy
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2025
Boran Zhao, Haiming Zhai, Zihang Yuan, Hetian Liu, Tian Xia, Wenzhe Zhao, Pengju Ren
18 Aug 2025
LLMC+: Benchmarking Vision-Language Model Compression with a Plug-and-play Toolkit
Chengtao Lv, Bilang Zhang, Yang Yong, Yazhe Niu, Yushi Huang, Shiqiao Gu, Jiajun Wu, Yumeng Shi, Jinyang Guo, Wenya Wang
MLLM, VLM
13 Aug 2025
EGGS-PTP: An Expander-Graph Guided Structured Post-training Pruning Method for Large Language Models
Omar Bazarbachi, Zijun Sun, Yanning Shen
13 Aug 2025
READER: Retrieval-Assisted Drafter for Efficient LLM Inference
Maxim Divilkovskiy, Vitaly Malygin, Sergey Zlobin, Sultan Isali, Vasily Kalugin, Stanislav Ilyushin, Nuriza Aitassova, Yi Fei, Zeng Weidi
RALM
12 Aug 2025
P/D-Device: Disaggregated Large Language Model between Cloud and Devices
Yibo Jin, Yixu Xu, Yue-ting Chen, C. Wang, Tao Wang, ..., Zhe Wang, Hefei Guo, Hongjie Liu, Wei Lu, Zhengyong Zhang
12 Aug 2025
Pushing the Envelope of LLM Inference on AI-PC
E. Georganas, Dhiraj D. Kalamkar, Alexander Heinecke
MQ
08 Aug 2025
Deep Language Geometry: Constructing a Metric Space from LLM Weights
Maksym Shamrai, Vladyslav Hamolia
08 Aug 2025
Pruning Large Language Models by Identifying and Preserving Functional Networks
Yiheng Liu, Junhao Ning, Sichen Xia, Xiaohui Gao, Ning Qiang, Bao Ge, Junwei Han, Xiaoyan Cai
07 Aug 2025
Provable Post-Training Quantization: Theoretical Analysis of OPTQ and Qronos
Haoyu Zhang, Shihao Zhang, Ian Colbert, Rayan Saab
MQ
06 Aug 2025
LeanK: Learnable K Cache Channel Pruning for Efficient Decoding
Y. Zhang, Zhiyuan He, Huiqiang Jiang, Chengruidong Zhang, Yuqing Yang, Jianyong Wang, Lili Qiu
04 Aug 2025
Amber Pruner: Leveraging N:M Activation Sparsity for Efficient Prefill in Large Language Models
Tai An, Ruwu Cai, Yanzhe Zhang, Yang Liu, Hao Chen, Pengcheng Xie, Sheng Chang, Jing Lin, Gongyi Wang
MoE
04 Aug 2025
CAMERA: Multi-Matrix Joint Compression for MoE Models via Micro-Expert Redundancy Analysis
Yuzhuang Xu, Xu Han, Yuanchi Zhang, Yixuan Wang, Yijun Liu, Shiyu Ji, Qingfu Zhu, Wanxiang Che
MoE, MQ
04 Aug 2025
XSpecMesh: Quality-Preserving Auto-Regressive Mesh Generation Acceleration via Multi-Head Speculative Decoding
Dian Chen, Yansong Qu, Xinyang Li, Ming Li, Shengchuan Zhang
31 Jul 2025
Unveiling Super Experts in Mixture-of-Experts Large Language Models
Zunhai Su, Qingyuan Li, Hao Zhang, Weihao Ye, Qibo Xue, YuLei Qian, Yuchen Xie, Ngai Wong, Kehong Yuan
MoE
31 Jul 2025
Detection Transformers Under the Knife: A Neuroscience-Inspired Approach to Ablations
Nils Hütten, Florian Hölken, Hasan Tercan, Tobias Meisen
MedIm
29 Jul 2025
LoRA-PAR: A Flexible Dual-System LoRA Partitioning Approach to Efficient LLM Fine-Tuning
Yining Huang, Bin Li, Keke Tang, Meilian Chen
MoE, LRM
28 Jul 2025
Enhancing Large Multimodal Models with Adaptive Sparsity and KV Cache Compression
Te Zhang, Yuheng Li, Junxiang Wang, Lujun Li
28 Jul 2025
Squeeze10-LLM: Squeezing LLMs' Weights by 10 Times via a Staged Mixed-Precision Quantization Method
Qingcheng Zhu, Yangyang Ren, L. Yang, Mingbao Lin, Yanjing Li, ..., Haodong Zhu, Yuguang Yang, Juan Zhang, Runqi Wang, Baochang Zhang
MQ
24 Jul 2025
DFQ-ViT: Data-Free Quantization for Vision Transformers without Fine-tuning
Yujia Tong, Jingling Yuan, Tian Zhang, Jianquan Liu, Chuang Hu
MQ
19 Jul 2025
BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity
Chenyang Song, Weilin Zhao, Xu Han, Chaojun Xiao, Yingfa Chen, Yuxuan Li, Zhiyuan Liu, Maosong Sun
MoE
11 Jul 2025
BLaST: High Performance Inference and Pretraining using BLock Sparse Transformers
Patrik Okanovic, Sameer Deshmukh, Grzegorz Kwaśniewski, Yi Zhu, Haruto Fujii, ..., Maciej Besta, Kentaro Katayama, Takumi Honda, Yusuke Nagasaka, Torsten Hoefler
03 Jul 2025
DuoGPT: Training-free Dual Sparsity through Activation-aware Pruning in LLMs
Ruokai Yin, Yuhang Li, Donghyun Lee, Priyadarshini Panda
VLM
25 Jun 2025
Revisiting LoRA through the Lens of Parameter Redundancy: Spectral Encoding Helps
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Jiashun Cheng, Chenyi Zi, Polydoros Giannouris, Ziqi Gao, Yuhan Li, Jia Li, Fugee Tsung
20 Jun 2025
SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity
Samir Khaki, Xiuyu Li, Junxian Guo, Ligeng Zhu, Chenfeng Xu, Konstantinos N. Plataniotis, Amir Yazdanbakhsh, Kurt Keutzer, Song Han, Zhijian Liu
19 Jun 2025
MaskPro: Linear-Space Probabilistic Learning for Strict (N:M)-Sparsity on Large Language Models
Yan Sun, Qixin Zhang, Zhiyuan Yu, Xikun Zhang, Li Shen, Dacheng Tao
15 Jun 2025
Training-free LLM Merging for Multi-task Learning
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Zichuan Fu, Xian Wu, Y. X. R. Wang, Wanyu Wang, Shanshan Ye, Hongzhi Yin, Yi-Ju Chang, Yefeng Zheng, Xiangyu Zhao
MoMe
14 Jun 2025
Compression Aware Certified Training
Changming Xu, Gagandeep Singh
13 Jun 2025
SlotPi: Physics-informed Object-centric Reasoning Models
Knowledge Discovery and Data Mining (KDD), 2025
Jian Li, Wan Han, Ning Lin, Yu-Liang Zhan, Ruizhi Chengze, ..., Yi-Feng Zhang, Hongsheng Liu, Zidong Wang, Fan Yu, Hao Sun
OCL, LRM, AI4CE
12 Jun 2025
On-the-Fly Adaptive Distillation of Transformer to Dual-State Linear Attention
Yeonju Ro, Zhenyu Zhang, Souvik Kundu, Zhangyang Wang, Aditya Akella
11 Jun 2025
Olica: Efficient Structured Pruning of Large Language Models without Retraining
Jiujun He, Huazhen Lin
10 Jun 2025
Beyond Bias Scores: Unmasking Vacuous Neutrality in Small Language Models
Sumanth Manduru, Carlotta Domeniconi
ALM
10 Jun 2025
SAFE: Finding Sparse and Flat Minima to Improve Pruning
Dongyeop Lee, Kwanhee Lee, Jinseok Chung, Namhoon Lee
07 Jun 2025
Eigenspectrum Analysis of Neural Networks without Aspect Ratio Bias
Yuanzhe Hu, Kinshuk Goel, Vlad Killiakov, Yaoqing Yang
06 Jun 2025
BAQ: Efficient Bit Allocation Quantization for Large Language Models
Chao Zhang, Li Wang, S. Lasaulce, Mérouane Debbah
MQ
06 Jun 2025
Kinetics: Rethinking Test-Time Scaling Laws
Ranajoy Sadhukhan, Zhuoming Chen, Haizhong Zheng, Yang Zhou, Emma Strubell, Beidi Chen
05 Jun 2025
AhaKV: Adaptive Holistic Attention-Driven KV Cache Eviction for Efficient Inference of Large Language Models
Yifeng Gu, Zicong Jiang, Jianxiu Jin, K. Guo, Ziyang Zhang, Xiangmin Xu
04 Jun 2025
Accurate Sublayer Pruning for Large Language Models by Exploiting Latency and Tunability Information
International Joint Conference on Artificial Intelligence (IJCAI), 2025
Seungcheol Park, Sojin Lee, Jongjin Kim, Jinsik Lee, Hyunjik Jo, U. Kang
04 Jun 2025
SkipGPT: Dynamic Layer Pruning Reinvented with Token Awareness and Module Decoupling
Anhao Zhao, Fanghua Ye, Yingqi Fan, Junlong Tong, Zhiwei Fei, Hui Su, Xiaoyu Shen
04 Jun 2025
Unifying Uniform and Binary-coding Quantization for Accurate Compression of Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Seungcheol Park, Jeongin Bae, Beomseok Kwon, Minjun Kim, Byeongwook Kim, S. Kwon, U. Kang, Dongsoo Lee
MQ
04 Jun 2025
MANBench: Is Your Multimodal Model Smarter than Human?
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Han Zhou, Qitong Xu, Yiheng Dong, Xin Yang
04 Jun 2025
QA-HFL: Quality-Aware Hierarchical Federated Learning for Resource-Constrained Mobile Devices with Heterogeneous Image Quality
Sajid Hussain, Muhammad Sohail, Nauman Ali Khan
04 Jun 2025
UniSite: The First Cross-Structure Dataset and Learning Framework for End-to-End Ligand Binding Site Detection
Jigang Fan, Quanlin Wu, Shengjie Luo, Liwei Wang
03 Jun 2025
FLoE: Fisher-Based Layer Selection for Efficient Sparse Adaptation of Low-Rank Experts
Xinyi Wang, Lirong Gao, Haobo Wang, Yiming Zhang, Junbo Zhao
MoE
31 May 2025
EffiVLM-BENCH: A Comprehensive Benchmark for Evaluating Training-Free Acceleration in Large Vision-Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Zekun Wang, Minghua Ma, Zexin Wang, Rongchuan Mu, Liping Shan, Ming Liu, Bing Qin
VLM
31 May 2025