ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2306.11695
  4. Cited By
A Simple and Effective Pruning Approach for Large Language Models

A Simple and Effective Pruning Approach for Large Language Models

20 June 2023
Mingjie Sun
Zhuang Liu
Anna Bair
J. Zico Kolter
ArXivPDFHTML

Papers citing "A Simple and Effective Pruning Approach for Large Language Models"

50 / 271 papers shown
Title
Skipping Computations in Multimodal LLMs
Skipping Computations in Multimodal LLMs
Mustafa Shukor
Matthieu Cord
24
2
0
12 Oct 2024
DARE the Extreme: Revisiting Delta-Parameter Pruning For Fine-Tuned Models
DARE the Extreme: Revisiting Delta-Parameter Pruning For Fine-Tuned Models
Wenlong Deng
Yize Zhao
V. Vakilian
Minghui Chen
Xiaoxiao Li
Christos Thrampoulidis
35
3
0
12 Oct 2024
Compressing Large Language Models with Automated Sub-Network Search
Compressing Large Language Models with Automated Sub-Network Search
R. Sukthanker
B. Staffler
Frank Hutter
Aaron Klein
LRM
35
0
0
09 Oct 2024
ParallelSpec: Parallel Drafter for Efficient Speculative Decoding
ParallelSpec: Parallel Drafter for Efficient Speculative Decoding
Zilin Xiao
Hongming Zhang
Tao Ge
Siru Ouyang
Vicente Ordonez
Dong Yu
39
5
0
08 Oct 2024
Mixture Compressor for Mixture-of-Experts LLMs Gains More
Mixture Compressor for Mixture-of-Experts LLMs Gains More
Wei Huang
Yue Liao
Jianhui Liu
Ruifei He
Haoru Tan
Shiming Zhang
Hongsheng Li
Si Liu
Xiaojuan Qi
MoE
36
3
0
08 Oct 2024
Superficial Safety Alignment Hypothesis
Superficial Safety Alignment Hypothesis
Jianwei Li
Jung-Eun Kim
24
1
0
07 Oct 2024
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
Jinhao Li
Jiaming Xu
Shan Huang
Yonghua Chen
Wen Li
...
Jiayi Pan
Li Ding
Hao Zhou
Yu Wang
Guohao Dai
57
15
0
06 Oct 2024
RespDiff: An End-to-End Multi-scale RNN Diffusion Model for Respiratory
  Waveform Estimation from PPG Signals
RespDiff: An End-to-End Multi-scale RNN Diffusion Model for Respiratory Waveform Estimation from PPG Signals
Yuyang Miao
Zehua Chen
C. Li
Danilo P. Mandic
DiffM
MedIm
28
5
0
06 Oct 2024
ARB-LLM: Alternating Refined Binarizations for Large Language Models
ARB-LLM: Alternating Refined Binarizations for Large Language Models
Zhiteng Li
X. Yan
Tianao Zhang
Haotong Qin
Dong Xie
Jiang Tian
Zhongchao Shi
Linghe Kong
Yulun Zhang
Xiaokang Yang
MQ
29
2
0
04 Oct 2024
How Much Can RAG Help the Reasoning of LLM?
How Much Can RAG Help the Reasoning of LLM?
Jingyu Liu
Jiaen Lin
Yong Liu
LRM
18
9
0
03 Oct 2024
Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model
  Compression
Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression
Jingcun Wang
Yu-Guang Chen
Ing-Chao Lin
Bing Li
Grace Li Zhang
33
4
0
02 Oct 2024
Do Influence Functions Work on Large Language Models?
Do Influence Functions Work on Large Language Models?
Zhe Li
Wei Zhao
Yige Li
Jun Sun
TDI
28
1
0
30 Sep 2024
Two Sparse Matrices are Better than One: Sparsifying Neural Networks
  with Double Sparse Factorization
Two Sparse Matrices are Better than One: Sparsifying Neural Networks with Double Sparse Factorization
Vladimír Boža
Vladimír Macko
20
0
0
27 Sep 2024
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models
Gongfan Fang
Hongxu Yin
Saurav Muralidharan
Greg Heinrich
Jeff Pool
Jan Kautz
Pavlo Molchanov
Xinchao Wang
19
3
0
26 Sep 2024
Enhancing Aspect-based Sentiment Analysis in Tourism Using Large
  Language Models and Positional Information
Enhancing Aspect-based Sentiment Analysis in Tourism Using Large Language Models and Positional Information
Chun Xu
Mengmeng Wang
Yan Ren
Shaolin Zhu
13
1
0
23 Sep 2024
OStr-DARTS: Differentiable Neural Architecture Search based on Operation
  Strength
OStr-DARTS: Differentiable Neural Architecture Search based on Operation Strength
Le Yang
Ziwei Zheng
Yizeng Han
Shiji Song
Gao Huang
Fan Li
18
1
0
22 Sep 2024
On Importance of Pruning and Distillation for Efficient Low Resource NLP
On Importance of Pruning and Distillation for Efficient Low Resource NLP
Aishwarya Mirashi
Purva Lingayat
Srushti Sonavane
Tejas Padhiyar
Raviraj Joshi
Geetanjali Kale
13
1
0
21 Sep 2024
OATS: Outlier-Aware Pruning Through Sparse and Low Rank Decomposition
OATS: Outlier-Aware Pruning Through Sparse and Low Rank Decomposition
Stephen Zhang
V. Papyan
VLM
43
1
0
20 Sep 2024
KVPruner: Structural Pruning for Faster and Memory-Efficient Large
  Language Models
KVPruner: Structural Pruning for Faster and Memory-Efficient Large Language Models
Bo Lv
Quan Zhou
Xuanang Ding
Yan Wang
Zeming Ma
VLM
16
1
0
17 Sep 2024
S-STE: Continuous Pruning Function for Efficient 2:4 Sparse Pre-training
S-STE: Continuous Pruning Function for Efficient 2:4 Sparse Pre-training
Yuezhou Hu
Jun-Jie Zhu
Jianfei Chen
26
0
0
13 Sep 2024
STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning
STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning
Jaeseong Lee
seung-won hwang
Aurick Qiao
Daniel F Campos
Z. Yao
Yuxiong He
18
2
0
10 Sep 2024
Wavelet GPT: Wavelet Inspired Large Language Models
Wavelet GPT: Wavelet Inspired Large Language Models
Prateek Verma
AI4TS
18
0
0
04 Sep 2024
Mixed Sparsity Training: Achieving 4$\times$ FLOP Reduction for
  Transformer Pretraining
Mixed Sparsity Training: Achieving 4×\times× FLOP Reduction for Transformer Pretraining
Pihe Hu
Shaolong Li
Longbo Huang
21
0
0
21 Aug 2024
Enhancing One-shot Pruned Pre-trained Language Models through
  Sparse-Dense-Sparse Mechanism
Enhancing One-shot Pruned Pre-trained Language Models through Sparse-Dense-Sparse Mechanism
Guanchen Li
Xiandong Zhao
Lian Liu
Zeping Li
Dong Li
Lu Tian
Jie He
Ashish Sirasao
E. Barsoum
VLM
27
0
0
20 Aug 2024
MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding
MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding
Jian Chen
Vashisth Tiwari
Ranajoy Sadhukhan
Zhuoming Chen
Jinyuan Shi
Ian En-Hsu Yen
Ian En-Hsu Yen
Avner May
Tianqi Chen
Beidi Chen
LRM
31
22
0
20 Aug 2024
MoDeGPT: Modular Decomposition for Large Language Model Compression
MoDeGPT: Modular Decomposition for Large Language Model Compression
Chi-Heng Lin
Shangqian Gao
James Seale Smith
Abhishek Patel
Shikhar Tuli
Yilin Shen
Hongxia Jin
Yen-Chang Hsu
68
6
0
19 Aug 2024
FactorLLM: Factorizing Knowledge via Mixture of Experts for Large
  Language Models
FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models
Zhongyu Zhao
Menghang Dong
Rongyu Zhang
Wenzhao Zheng
Yunpeng Zhang
Huanrui Yang
Dalong Du
Kurt Keutzer
Shanghang Zhang
46
0
0
15 Aug 2024
LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference
  Serving at Scale
LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale
Jaehong Cho
Minsu Kim
Hyunmin Choi
Guseul Heo
Jongse Park
38
8
0
10 Aug 2024
A Convex-optimization-based Layer-wise Post-training Pruner for Large
  Language Models
A Convex-optimization-based Layer-wise Post-training Pruner for Large Language Models
Pengxiang Zhao
Hanyu Hu
Ping Li
Yi Zheng
Zhefeng Wang
Xiaoming Yuan
25
1
0
07 Aug 2024
Logistic Regression makes small LLMs strong and explainable
  "tens-of-shot" classifiers
Logistic Regression makes small LLMs strong and explainable "tens-of-shot" classifiers
Marcus Buckmann
Edward Hill
29
1
0
06 Aug 2024
Inference Optimizations for Large Language Models: Effects, Challenges,
  and Practical Considerations
Inference Optimizations for Large Language Models: Effects, Challenges, and Practical Considerations
Leo Donisch
Sigurd Schacht
Carsten Lanquillon
19
2
0
06 Aug 2024
STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
Peijie Dong
Lujun Li
Dayou Du
Yuhan Chen
Zhenheng Tang
...
Wei Xue
Wenhan Luo
Qi-fei Liu
Yi-Ting Guo
Xiaowen Chu
MQ
43
4
0
03 Aug 2024
Mission Impossible: A Statistical Perspective on Jailbreaking LLMs
Mission Impossible: A Statistical Perspective on Jailbreaking LLMs
Jingtong Su
Mingyu Lee
SangKeun Lee
33
7
0
02 Aug 2024
Pruning Large Language Models with Semi-Structural Adaptive Sparse
  Training
Pruning Large Language Models with Semi-Structural Adaptive Sparse Training
Weiyu Huang
Yuezhou Hu
Guohao Jian
Jun Zhu
Jianfei Chen
28
5
0
30 Jul 2024
ThinK: Thinner Key Cache by Query-Driven Pruning
ThinK: Thinner Key Cache by Query-Driven Pruning
Yuhui Xu
Zhanming Jie
Hanze Dong
Lei Wang
Xudong Lu
Aojun Zhou
Amrita Saha
Caiming Xiong
Doyen Sahoo
67
14
0
30 Jul 2024
Greedy Output Approximation: Towards Efficient Structured Pruning for
  LLMs Without Retraining
Greedy Output Approximation: Towards Efficient Structured Pruning for LLMs Without Retraining
Jianwei Li
Yijun Dong
Qi Lei
19
5
0
26 Jul 2024
Efficient Inference of Vision Instruction-Following Models with Elastic
  Cache
Efficient Inference of Vision Instruction-Following Models with Elastic Cache
Zuyan Liu
Benlin Liu
Jiahui Wang
Yuhao Dong
Guangyi Chen
Yongming Rao
Ranjay Krishna
Jiwen Lu
VLM
34
8
0
25 Jul 2024
LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference
LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference
Qichen Fu
Minsik Cho
Thomas Merth
Sachin Mehta
Mohammad Rastegari
Mahyar Najibi
33
25
0
19 Jul 2024
Reconstruct the Pruned Model without Any Retraining
Reconstruct the Pruned Model without Any Retraining
Pingjie Wang
Ziqing Fan
Shengchao Hu
Zhe Chen
Yanfeng Wang
Yu Wang
28
1
0
18 Jul 2024
MINI-LLM: Memory-Efficient Structured Pruning for Large Language Models
MINI-LLM: Memory-Efficient Structured Pruning for Large Language Models
Hongrong Cheng
Miao Zhang
J. Q. Shi
41
2
0
16 Jul 2024
From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from
  Low-Rank Gradients
From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients
Ajay Jaiswal
Lu Yin
Zhenyu (Allen) Zhang
Shiwei Liu
Jiawei Zhao
Yuandong Tian
Zhangyang Wang
31
14
0
15 Jul 2024
Real-Time Anomaly Detection and Reactive Planning with Large Language
  Models
Real-Time Anomaly Detection and Reactive Planning with Large Language Models
Rohan Sinha
Amine Elhafsi
Christopher Agia
Matthew Foutter
Edward Schmerling
Marco Pavone
OffRL
LRM
35
24
0
11 Jul 2024
Adversarial Attacks and Defenses on Text-to-Image Diffusion Models: A
  Survey
Adversarial Attacks and Defenses on Text-to-Image Diffusion Models: A Survey
Chenyu Zhang
Mingwang Hu
Wenhui Li
Lanjun Wang
37
13
0
10 Jul 2024
Composable Interventions for Language Models
Composable Interventions for Language Models
Arinbjorn Kolbeinsson
Kyle O'Brien
Tianjin Huang
Shanghua Gao
Shiwei Liu
...
Anurag J. Vaidya
Faisal Mahmood
Marinka Zitnik
Tianlong Chen
Thomas Hartvigsen
KELM
MU
77
5
0
09 Jul 2024
Isomorphic Pruning for Vision Models
Isomorphic Pruning for Vision Models
Gongfan Fang
Xinyin Ma
Michael Bi Mi
Xinchao Wang
VLM
ViT
34
6
0
05 Jul 2024
Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models:
  Enhancing Performance and Reducing Inference Costs
Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs
Enshu Liu
Junyi Zhu
Zinan Lin
Xuefei Ning
Matthew B. Blaschko
Shengen Yan
Guohao Dai
Huazhong Yang
Yu Wang
MoE
52
5
0
01 Jul 2024
VcLLM: Video Codecs are Secretly Tensor Codecs
VcLLM: Video Codecs are Secretly Tensor Codecs
Ceyu Xu
Yongji Wu
Xinyu Yang
Beidi Chen
Matthew Lentz
Danyang Zhuo
Lisa Wu Wills
45
0
0
29 Jun 2024
JailbreakZoo: Survey, Landscapes, and Horizons in Jailbreaking Large
  Language and Vision-Language Models
JailbreakZoo: Survey, Landscapes, and Horizons in Jailbreaking Large Language and Vision-Language Models
Haibo Jin
Leyang Hu
Xinuo Li
Peiyan Zhang
Chonghan Chen
Jun Zhuang
Haohan Wang
PILM
36
26
0
26 Jun 2024
BlockLLM: Memory-Efficient Adaptation of LLMs by Selecting and
  Optimizing the Right Coordinate Blocks
BlockLLM: Memory-Efficient Adaptation of LLMs by Selecting and Optimizing the Right Coordinate Blocks
A. Ramesh
Vignesh Ganapathiraman
I. Laradji
Mark W. Schmidt
22
1
0
25 Jun 2024
Learning on Transformers is Provable Low-Rank and Sparse: A One-layer
  Analysis
Learning on Transformers is Provable Low-Rank and Sparse: A One-layer Analysis
Hongkang Li
Meng Wang
Shuai Zhang
Sijia Liu
Pin-Yu Chen
30
6
0
24 Jun 2024
Previous
123456
Next