Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2108.12594
Cited By
Layer-wise Model Pruning based on Mutual Information
28 August 2021
Chun Fan
Jiwei Li
Xiang Ao
Fei Wu
Yuxian Meng
Xiaofei Sun
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Layer-wise Model Pruning based on Mutual Information"
16 / 16 papers shown
Title
Mix-QSAM: Mixed-Precision Quantization of the Segment Anything Model
Navin Ranjan
Andreas E. Savakis
MQ
VLM
63
0
0
08 May 2025
Attention Pruning: Automated Fairness Repair of Language Models via Surrogate Simulated Annealing
Vishnu Asutosh Dasu
Md. Rafi Ur Rashid
Vipul Gupta
Saeid Tizpaz-Niari
Gang Tan
AAML
43
0
0
20 Mar 2025
Dynamic Low-Rank Sparse Adaptation for Large Language Models
Weizhong Huang
Yuxin Zhang
Xiawu Zheng
Y. Liu
Jing Lin
Yiwu Yao
Rongrong Ji
85
1
0
21 Feb 2025
How Redundant Is the Transformer Stack in Speech Representation Models?
Teresa Dorszewski
Albert Kjøller Jacobsen
Lenka Tětková
Lars Kai Hansen
107
0
0
20 Jan 2025
Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models
Kai Yao
P. Gao
Lichun Li
Yuan Zhao
Xiaofeng Wang
W. Wang
Jianke Zhu
24
1
0
15 Oct 2024
Persistent Topological Features in Large Language Models
Yuri Gardinazzi
Giada Panerai
Karthik Viswanathan
A. Ansuini
Alberto Cazzaniga
Matteo Biagetti
39
2
0
14 Oct 2024
MPruner: Optimizing Neural Network Size with CKA-Based Mutual Information Pruning
Seungbeom Hu
ChanJun Park
Andrew Ferraiuolo
Sang-Ki Ko
Jinwoo Kim
Haein Song
Jieung Kim
16
1
0
24 Aug 2024
The Remarkable Robustness of LLMs: Stages of Inference?
Vedang Lad
Wes Gurnee
Max Tegmark
33
33
0
27 Jun 2024
Large Language Model Pruning
Hanjuan Huang
Hao-Jia Song
H. Pao
33
0
0
24 May 2024
The Unreasonable Ineffectiveness of the Deeper Layers
Andrey Gromov
Kushal Tirumala
Hassan Shapourian
Paolo Glorioso
Daniel A. Roberts
41
79
0
26 Mar 2024
Fairness-Aware Structured Pruning in Transformers
A. Zayed
Gonçalo Mordido
Samira Shabanian
Ioana Baldini
Sarath Chandar
27
15
0
24 Dec 2023
f-Divergence Minimization for Sequence-Level Knowledge Distillation
Yuqiao Wen
Zichao Li
Wenyu Du
Lili Mou
30
53
0
27 Jul 2023
Pruning Pretrained Encoders with a Multitask Objective
Patrick Xia
Richard Shin
39
0
0
10 Dec 2021
The Lottery Ticket Hypothesis for Pre-trained BERT Networks
Tianlong Chen
Jonathan Frankle
Shiyu Chang
Sijia Liu
Yang Zhang
Zhangyang Wang
Michael Carbin
148
345
0
23 Jul 2020
Description Based Text Classification with Reinforcement Learning
Duo Chai
Wei Yu Wu
Qinghong Han
Fei Wu
Jiwei Li
VLM
106
66
0
08 Feb 2020
Revisiting Self-Training for Neural Sequence Generation
Junxian He
Jiatao Gu
Jiajun Shen
MarcÁurelio Ranzato
SSL
LRM
242
269
0
30 Sep 2019
1