Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning
Workshop on Representation Learning for NLP (RepL4NLP), 2020
19 February 2020 · arXiv:2002.08307 (v2, latest)
Mitchell A. Gordon, Kevin Duh, Nicholas Andrews

Papers citing "Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning"

Showing 50 of 195 citing papers (page 1 of 4).
CatBack: Universal Backdoor Attacks on Tabular Data via Categorical Encoding
Behrad Tajalli, Stefanos Koffas, S. Picek · AAML · 08 Nov 2025

A Metamorphic Testing Perspective on Knowledge Distillation for Language Models of Code: Does the Student Deeply Mimic the Teacher?
Md. Abdul Awal, Mrigank Rochan, Chanchal K. Roy · 07 Nov 2025

Efficient Adaptive Transformer: An Empirical Study and Reproducible Framework
Jan Miller · 14 Oct 2025
Entropy Meets Importance: A Unified Head Importance-Entropy Score for Stable and Efficient Transformer Pruning
Minsik Choi, Hyegang Son, Changhoon Kim, Young Geun Kim · AAML · 10 Oct 2025

A Second-Order Perspective on Pruning at Initialization and Knowledge Transfer
Leonardo Iurada, Beatrice Occhiena, Tatiana Tommasi · VLM · 28 Sep 2025

Assortment of Attention Heads: Accelerating Federated PEFT with Head Pruning and Strategic Client Selection
Yeshwanth Venkatesha, Souvik Kundu, Priyadarshini Panda · 31 May 2025

Generative Artificial Intelligence for Internet of Things Computing: A Systematic Survey
Fabrizio Mangione, Claudio Savaglio, Giancarlo Fortino · 10 Apr 2025
As easy as PIE: understanding when pruning causes language models to disagree
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Pietro Tropeano, Maria Maistro, Tuukka Ruotsalo, Christina Lioma · 27 Mar 2025

Moss: Proxy Model-based Full-Weight Aggregation in Federated Learning with Heterogeneous Models
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT), 2025
Y. Cai, Ziqi Zhang, Ding Li, Yao Guo, Xiangqun Chen · 13 Mar 2025

Signal Collapse in One-Shot Pruning: When Sparse Models Fail to Distinguish Neural Representations
Dhananjay Saikumar, Blesson Varghese · 18 Feb 2025
On the Compression of Language Models for Code: An Empirical Study on CodeBERT
IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER), 2024
Giordano d'Aloisio, Luca Traini, Federica Sarro, A. Marco · 18 Dec 2024

SoftLMs: Efficient Adaptive Low-Rank Approximation of Language Models using Soft-Thresholding Mechanism
Priyansh Bhatnagar, Linfeng Wen, Mingu Kang · 15 Nov 2024

Exploring the Benefit of Activation Sparsity in Pre-training
International Conference on Machine Learning (ICML), 2024
Zhengyan Zhang, Chaojun Xiao, Qiujieli Qin, Yankai Lin, Zhiyuan Zeng, Xu Han, Zhiyuan Liu, Ruobing Xie, Maosong Sun, Jie Zhou · MoE · 04 Oct 2024
Exploiting Student Parallelism for Efficient GPU Inference of BERT-like Models in Online Services
Weiyan Wang, Yilun Jin, Yiming Zhang, Victor Junqiu Wei, Han Tian, Li Chen, Jinbao Xue, Yangyu Tao, Di Wang, Kai Chen · 22 Aug 2024

Cross-layer Attention Sharing for Pre-trained Large Language Models
Yongyu Mu, Yuzhang Wu, Yuchun Fan, Chenglong Wang, Hengyu Li, ..., Murun Yang, Fandong Meng, Jie Zhou, Tong Xiao, Jingbo Zhu · 04 Aug 2024

Greedy Output Approximation: Towards Efficient Structured Pruning for LLMs Without Retraining
Jianwei Li, Yijun Dong, Qi Lei · 26 Jul 2024
A Complete Survey on LLM-based AI Chatbots
Sumit Kumar Dam, Choong Seon Hong, Yu Qiao, Chaoning Zhang · 17 Jun 2024

Understanding Token Probability Encoding in Output Embeddings
Hakaze Cho, Yoshihiro Sakai, Kenshiro Tanaka, Mariko Kato, Naoya Inoue · 03 Jun 2024

Sparsity-Accelerated Training for Large Language Models
Da Ma, Lu Chen, Pengyu Wang, Hongshen Xu, Hanqi Li, Liangtai Sun, Su Zhu, Shuai Fan, Kai Yu · LRM · 03 Jun 2024

Reward-based Input Construction for Cross-document Relation Extraction
Byeonghu Na, Suhyeon Jo, Yeongmin Kim, Il-Chul Moon · 31 May 2024
Switchable Decision: Dynamic Neural Generation Networks
Shujian Zhang, Korawat Tanwisuth, Chengyue Gong, Pengcheng He, Mi Zhou · BDL · 07 May 2024

SPAFIT: Stratified Progressive Adaptation Fine-tuning for Pre-trained Large Language Models
Samir Arora, Liangliang Wang · 30 Apr 2024

MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning
Matteo Farina, Goran Frehse, Elia Cunegatti, Gaowen Liu, Giovanni Iacca, Elisa Ricci · VLM · 08 Apr 2024

LayerNorm: A key component in parameter-efficient fine-tuning
Taha ValizadehAslani, Hualou Liang · 29 Mar 2024
SEVEN: Pruning Transformer Model by Reserving Sentinels
IEEE International Joint Conference on Neural Networks (IJCNN), 2024
Jinying Xiao, Ping Li, Jie Nie, Zhe Tang · 19 Mar 2024

MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer
Jianjian Cao, Peng Ye, Shengze Li, Chong Yu, Yansong Tang, Jiwen Lu, Tao Chen · 05 Mar 2024

Model Compression and Efficient Inference for Large Language Models: A Survey
Wenxiao Wang, Wei Chen, Yicong Luo, Yongliu Long, Zhengkai Lin, Liye Zhang, Binbin Lin, Deng Cai, Xiaofei He · MQ · 15 Feb 2024
Less is KEN: a Universal and Simple Non-Parametric Pruning Algorithm for Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Michele Mastromattei, Fabio Massimo Zanzotto · VLM · 05 Feb 2024

DE³-BERT: Distance-Enhanced Early Exiting for BERT based on Prototypical Networks
Jianing He, Tao Gui, Weiping Ding, Duoqian Miao, Jun Zhao, Liang Hu, LongBing Cao · 03 Feb 2024

Understanding LLMs: A Comprehensive Overview from Training to Inference
Yi-Hsueh Liu, Haoyang He, Tianle Han, Xu-Yao Zhang, Mengyuan Liu, ..., Xiaoyan Cai, Tuo Zhang, Ning Qiang, Tianming Liu, Bao Ge · SyDa · 04 Jan 2024
DSFormer: Effective Compression of Text-Transformers by Dense-Sparse Weight Factorization
Rahul Chand, Yashoteja Prabhu, Pratyush Kumar · 20 Dec 2023

BiPFT: Binary Pre-trained Foundation Transformer with Low-rank Estimation of Binarization Residual Polynomials
AAAI Conference on Artificial Intelligence (AAAI), 2023
Xingrun Xing, Li Du, Xinyuan Wang, Xianlin Zeng, Yequan Wang, Zheng Zhang, Jiajun Zhang · 14 Dec 2023

Large Multimodal Model Compression via Efficient Pruning and Distillation at AntGroup
Xinjian Zhao, Yao-Min Zhao, Jiajia Liu, Jingdong Chen, Chenyi Zhuang, Jinjie Gu, Ruocheng Guo, Xiangyu Zhao · 10 Dec 2023

The Cost of Compression: Investigating the Impact of Compression on Parametric Knowledge in Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Srinath Namburi, Makesh Narsimhan Sreedhar, Srinath Srinivasan, Frederic Sala · MQ · 01 Dec 2023
DEED: Dynamic Early Exit on Decoder for Accelerating Encoder-Decoder Transformer Models
Peng Tang, Pengkai Zhu, Tian Li, Srikar Appalaraju, Vijay Mahadevan, R. Manmatha · 15 Nov 2023

EELBERT: Tiny Models through Dynamic Embeddings
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Gabrielle Cohn, Rishika Agarwal, Deepanshu Gupta, Siddharth Patwardhan · 31 Oct 2023

MOSEL: Inference Serving Using Dynamic Modality Selection
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Bodun Hu, Le Xu, Jeongyoon Moon, N. Yadwadkar, Aditya Akella · 27 Oct 2023

Outlier Dimensions Encode Task-Specific Knowledge
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
William Rudman, Catherine Chen, Carsten Eickhoff · 26 Oct 2023

Retrieval-based Knowledge Transfer: An Effective Approach for Extreme Large Language Model Compression
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Jiduan Liu, Jiahao Liu, Qifan Wang, Jingang Wang, Xunliang Cai, Dongyan Zhao, Ran Wang, Rui Yan · 24 Oct 2023

CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning without Full Large Language Model
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Kaiyan Zhang, Ning Ding, Biqing Qi, Xuekai Zhu, Xinwei Long, Bowen Zhou · 24 Oct 2023
Towards Robust Pruning: An Adaptive Knowledge-Retention Pruning Strategy for Language Models
Jianwei Li, Qi Lei, Wei Cheng, Dongkuan Xu · KELM · 19 Oct 2023

Breaking through Deterministic Barriers: Randomized Pruning Mask Generation and Selection
Jianwei Li, Weizhi Gao, Qi Lei, Dongkuan Xu · 19 Oct 2023

Pit One Against Many: Leveraging Attention-head Embeddings for Parameter-efficient Multi-head Attention
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Huiyin Xue, Nikolaos Aletras · 11 Oct 2023

Compresso: Structured Pruning with Collaborative Prompting Learns Compact Large Language Models
Song Guo, Jiahang Xu, Li Zhang, Mao Yang · 08 Oct 2023

Neural Language Model Pruning for Automatic Speech Recognition
Leonardo Emili, Thiago Fraga-Silva, Ernest Pusateri, M. Nußbaum-Thom, Youssef Oualil · 05 Oct 2023
Mitigating Shortcuts in Language Models with Soft Label Encoding
International Conference on Language Resources and Evaluation (LREC), 2023
Zirui He, Huiqi Deng, Haiyan Zhao, Ninghao Liu, Jundong Li · 17 Sep 2023

A Survey on Model Compression for Large Language Models
Transactions of the Association for Computational Linguistics (TACL), 2023
Xunyu Zhu, Jian Li, Yong Liu, Can Ma, Weiping Wang · 15 Aug 2023

SLoRA: Federated Parameter Efficient Fine-Tuning of Language Models
Sara Babakniya, A. Elkordy, Yahya H. Ezzeldin, Qingfeng Liu, Kee-Bong Song, Mostafa El-Khamy, Salman Avestimehr · 12 Aug 2023

DPBERT: Efficient Inference for BERT based on Dynamic Planning
European Conference on Artificial Intelligence (ECAI), 2023
Weixin Wu, H. Zhuo · 26 Jul 2023

Learned Thresholds Token Merging and Pruning for Vision Transformers
Maxim Bonnaerens, J. Dambre · 20 Jul 2023