Rethinking Network Pruning -- under the Pre-train and Fine-tune Paradigm
Dongkuan Xu, Ian En-Hsu Yen, Jinxi Zhao, Zhibin Xiao
arXiv 2104.08682 · 18 April 2021 · VLM, AAML
Papers citing "Rethinking Network Pruning -- under the Pre-train and Fine-tune Paradigm" (40 of 40 papers shown)
TT-MPD: Test Time Model Pruning and Distillation
Haihang Wu, Wei Wang, T. Malepathirana, Sachith Seneviratne, D. Oetomo, Saman K. Halgamuge
10 Dec 2024

MoE-I²: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition
Cheng Yang, Yang Sui, Jinqi Xiao, Lingyi Huang, Yu Gong, Yuanlin Duan, Wenqi Jia, Miao Yin, Yu Cheng, Bo Yuan
01 Nov 2024 · MoE

Ripple: Accelerating LLM Inference on Smartphones with Correlation-Aware Neuron Management
Tuowei Wang, Ruwen Fan, Minxing Huang, Zixu Hao, Kun Li, Ting Cao, Youyou Lu, Yaoxue Zhang, Ju Ren
25 Oct 2024

Enabling Efficient Batch Serving for LMaaS via Generation Length Prediction
Ke Cheng, Wen Hu, Zhi Wang, Peng Du, Jianguo Li, Sheng Zhang
07 Jun 2024

MoPEFT: A Mixture-of-PEFTs for the Segment Anything Model
Rajat Sahay, Andreas E. Savakis
01 May 2024 · MoE
GraSAME: Injecting Token-Level Structural Information to Pretrained Language Models via Graph-guided Self-Attention Mechanism
Shuzhou Yuan, Michael Färber
10 Apr 2024
Ouroboros: Generating Longer Drafts Phrase by Phrase for Faster Speculative Decoding
Weilin Zhao, Yuxiang Huang, Xu Han, Wang Xu, Chaojun Xiao, Xinrong Zhang, Yewei Fang, Kaihuo Zhang, Zhiyuan Liu, Maosong Sun
21 Feb 2024
Why Lift so Heavy? Slimming Large Language Models by Cutting Off the Layers
Shuzhou Yuan, Ercong Nie, Bolei Ma, Michael Färber
18 Feb 2024
APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference
Bowen Zhao, Hannaneh Hajishirzi, Qingqing Cao
22 Jan 2024

How to Prune Your Language Model: Recovering Accuracy on the "Sparsity May Cry" Benchmark
Eldar Kurtic, Torsten Hoefler, Dan Alistarh
21 Dec 2023

Breaking through Deterministic Barriers: Randomized Pruning Mask Generation and Selection
Jianwei Li, Weizhi Gao, Qi Lei, Dongkuan Xu
19 Oct 2023

Compressing LLMs: The Truth is Rarely Pure and Never Simple
Ajay Jaiswal, Zhe Gan, Xianzhi Du, Bowen Zhang, Zhangyang Wang, Yinfei Yang
02 Oct 2023 · MQ

Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs
Lu Yin, Ajay Jaiswal, Shiwei Liu, Souvik Kundu, Zhangyang Wang
29 Sep 2023

The Emergence of Essential Sparsity in Large Pre-trained Models: The Weights that Matter
Ajay Jaiswal, Shiwei Liu, Tianlong Chen, Zhangyang Wang
06 Jun 2023 · VLM

LLM-Pruner: On the Structural Pruning of Large Language Models
Xinyin Ma, Gongfan Fang, Xinchao Wang
19 May 2023

A Survey on Approximate Edge AI for Energy Efficient Autonomous Driving Services
Dewant Katare, Diego Perino, J. Nurmi, M. Warnier, Marijn Janssen, Aaron Yi Ding
13 Apr 2023

EdgeTran: Co-designing Transformers for Efficient Inference on Mobile Edge Platforms
Shikhar Tuli, N. Jha
24 Mar 2023

Knowledge-Enhanced Semi-Supervised Federated Learning for Aggregating Heterogeneous Lightweight Clients in IoT
Jiaqi Wang, Shenglai Zeng, Zewei Long, Yaqing Wang, Houping Xiao, Fenglong Ma
05 Mar 2023

Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
Shiwei Liu, Tianlong Chen, Zhenyu (Allen) Zhang, Xuxi Chen, Tianjin Huang, Ajay Jaiswal, Zhangyang Wang
03 Mar 2023

HomoDistil: Homotopic Task-Agnostic Distillation of Pre-trained Transformers
Chen Liang, Haoming Jiang, Zheng Li, Xianfeng Tang, Bin Yin, Tuo Zhao
19 Feb 2023 · VLM

What Matters In The Structured Pruning of Generative Language Models?
Michael Santacroce, Zixin Wen, Yelong Shen, Yuan-Fang Li
07 Feb 2023

AttMEMO: Accelerating Transformers with Memoization on Big Memory Systems
Yuan Feng, Hyeran Jeon, F. Blagojevic, Cyril Guyot, Qing Li, Dong Li
23 Jan 2023 · GNN

GMP*: Well-Tuned Gradual Magnitude Pruning Can Outperform Most BERT-Pruning Methods
Eldar Kurtic, Dan Alistarh
12 Oct 2022 · AI4MH

Universal Prompt Tuning for Graph Neural Networks
Taoran Fang, Yunchao Zhang, Yang Yang, Chunping Wang, Lei Chen
30 Sep 2022

Towards Sparsification of Graph Neural Networks
Hongwu Peng, Deniz Gurevin, Shaoyi Huang, Tong Geng, Weiwen Jiang, O. Khan, Caiwen Ding
11 Sep 2022 · GNN

S4: a High-sparsity, High-performance AI Accelerator
Ian En-Hsu Yen, Zhibin Xiao, Dongkuan Xu
16 Jul 2022

An Automatic and Efficient BERT Pruning for Edge AI Systems
Shaoyi Huang, Ning Liu, Yueying Liang, Hongwu Peng, Hongjia Li, Dongkuan Xu, Mimi Xie, Caiwen Ding
21 Jun 2022

Exploring Dimensionality Reduction Techniques in Multilingual Transformers
Álvaro Huertas-García, Alejandro Martín, Javier Huertas-Tato, David Camacho
18 Apr 2022

Structured Pruning Learns Compact and Accurate Models
Mengzhou Xia, Zexuan Zhong, Danqi Chen
01 Apr 2022 · VLM

The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models
Eldar Kurtic, Daniel Fernando Campos, Tuan Nguyen, Elias Frantar, Mark Kurtz, Ben Fineran, Michael Goin, Dan Alistarh
14 Mar 2022 · VLM, MQ, MedIm

AutoDistil: Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models
Dongkuan Xu, Subhabrata Mukherjee, Xiaodong Liu, Debadeepta Dey, Wenhui Wang, Xiang Zhang, Ahmed Hassan Awadallah, Jianfeng Gao
29 Jan 2022

From Dense to Sparse: Contrastive Pruning for Better Pre-trained Language Model Compression
Runxin Xu, Fuli Luo, Chengyu Wang, Baobao Chang, Jun Huang, Songfang Huang, Fei Huang
14 Dec 2021 · VLM

Robustness Challenges in Model Distillation and Pruning for Natural Language Understanding
Mengnan Du, Subhabrata Mukherjee, Yu Cheng, Milad Shokouhi, Xia Hu, Ahmed Hassan Awadallah
16 Oct 2021

Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm
Shaoyi Huang, Dongkuan Xu, Ian En-Hsu Yen, Yijue Wang, Sung-En Chang, ..., Shiyang Chen, Mimi Xie, Sanguthevar Rajasekaran, Hang Liu, Caiwen Ding
15 Oct 2021 · CLL, VLM

Composable Sparse Fine-Tuning for Cross-Lingual Transfer
Alan Ansell, E. Ponti, Anna Korhonen, Ivan Vulić
14 Oct 2021 · CLL, MoE

Greenformers: Improving Computation and Memory Efficiency in Transformer Models via Low-Rank Approximation
Samuel Cahyawijaya
24 Aug 2021

The Lottery Ticket Hypothesis for Pre-trained BERT Networks
Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Zhangyang Wang, Michael Carbin
23 Jul 2020

Comparing Rewinding and Fine-tuning in Neural Network Pruning
Alex Renda, Jonathan Frankle, Michael Carbin
05 Mar 2020

BERT-of-Theseus: Compressing BERT by Progressive Module Replacing
Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei, Ming Zhou
07 Feb 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
20 Apr 2018 · ELM