arXiv:2002.11985
Compressing Large-Scale Transformer-Based Models: A Case Study on BERT
Transactions of the Association for Computational Linguistics (TACL), 2020
27 February 2020
Prakhar Ganesh
Yao Chen
Xin Lou
Mohammad Ali Khan
Yifan Yang
Hassan Sajjad
Preslav Nakov
Deming Chen
Marianne Winslett
AI4CE
Papers citing "Compressing Large-Scale Transformer-Based Models: A Case Study on BERT"
Showing 50 of 70 citing papers
FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation
Dong Liu
Jiayi Zhang
Yanxuan Yu
Ben Lengerich
Ying Nian Wu
30 Mar 2026
Efficient Split Learning LSTM Models for FPGA-based Edge IoT Devices
Romina Soledad Molina
Vukan Ninkovic
D. Vukobratović
Maria Liz Crespo
Marco Zennaro
12 Feb 2025
MAS-Attention: Memory-Aware Stream Processing for Attention Acceleration on Resource-Constrained Edge Devices
Mohammadali Shakerdargah
Shan Lu
Chao Gao
Di Niu
20 Nov 2024
FAMOUS: Flexible Accelerator for the Attention Mechanism of Transformer on UltraScale+ FPGAs
International Conference on Field-Programmable Technology (ICFPT), 2024
Ehsan Kabir
Md. Arafat Kabir
Austin R. J. Downey
Jason D. Bakos
David Andrews
Miaoqing Huang
GNN
21 Sep 2024
Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging
Deyuan Liu
Zhan Qin
Han Wang
Zhao Yang
Zecheng Wang
...
Zhao Lv
Zhiying Tu
Dianhui Chu
Bo Li
Dianbo Sui
24 Jun 2024
Large Language Model Pruning
Hanjuan Huang
Hao-Jia Song
H. Pao
24 May 2024
SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
Wei Huang
Haotong Qin
Yangdong Liu
Yawei Li
Qinshuo Liu
Xianglong Liu
Luca Benini
Michele Magno
Shiming Zhang
Xiaojuan Qi
MQ
23 May 2024
Relation Extraction Using Large Language Models: A Case Study on Acupuncture Point Locations
Yiming Li
Xueqing Peng
Jianfu Li
X. Zuo
Suyuan Peng
Donghong Pei
Cui Tao
Hua Xu
Na Hong
LM&MA
08 Apr 2024
QuantTune: Optimizing Model Quantization with Adaptive Outlier-Driven Fine Tuning
Jiun-Man Chen
Yu-Hsuan Chao
Yu-Jie Wang
Ming-Der Shieh
Chih-Chung Hsu
Wei-Fen Lin
MQ
11 Mar 2024
Only Send What You Need: Learning to Communicate Efficiently in Federated Multilingual Machine Translation
IEEE Transactions on Audio, Speech, and Language Processing (IEEE TASLP), 2024
Yun-Wei Chu
Dong-Jun Han
Christopher G. Brinton
15 Jan 2024
Can persistent homology whiten Transformer-based black-box models? A case study on BERT compression
Luis Balderas
Miguel Lastra
José M. Benítez
17 Dec 2023
Few-Shot Classification & Segmentation Using Large Language Models Agent
Tian Meng
Yang Tao
Wuliang Yin
VLM
19 Nov 2023
EELBERT: Tiny Models through Dynamic Embeddings
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Gabrielle Cohn
Rishika Agarwal
Deepanshu Gupta
Siddharth Patwardhan
31 Oct 2023
Pitfalls in Language Models for Code Intelligence: A Taxonomy and Survey
ACM Transactions on Software Engineering and Methodology (TOSEM), 2023
Xinyu She
Yue Liu
Yanjie Zhao
Yiling He
Li Li
Chakkrit Tantithamthavorn
Zhan Qin
Haoyu Wang
ELM
27 Oct 2023
Neural Language Model Pruning for Automatic Speech Recognition
Leonardo Emili
Thiago Fraga-Silva
Ernest Pusateri
M. Nußbaum-Thom
Youssef Oualil
05 Oct 2023
Transformers in Healthcare: A Survey
Subhash Nerella
S. Bandyopadhyay
Jiaqing Zhang
Miguel Contreras
Scott Siegel
...
Jessica Sena
B. Shickel
A. Bihorac
Kia Khezeli
Parisa Rashidi
MedIm
AI4CE
30 Jun 2023
Deep Fusion: Efficient Network Training via Pre-trained Initializations
International Conference on Machine Learning (ICML), 2023
Hanna Mazzawi
X. Gonzalvo
Michael Wunder
Sammy Jerome
Benoit Dherin
AI4CE
20 Jun 2023
Adaptive Sparsity Level during Training for Efficient Time Series Forecasting with Transformers
Zahra Atashgahi
Mykola Pechenizkiy
Raymond N. J. Veldhuis
Decebal Constantin Mocanu
AI4TS
AI4CE
28 May 2023
Stop Uploading Test Data in Plain Text: Practical Strategies for Mitigating Data Contamination by Evaluation Benchmarks
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Alon Jacovi
Avi Caciularu
Omer Goldman
Yoav Goldberg
17 May 2023
Lightweight Toxicity Detection in Spoken Language: A Transformer-based Approach for Edge Devices
Ahlam Husni Abu Nada
S. Latif
Junaid Qadir
22 Apr 2023
Classification of integers based on residue classes via modern deep learning algorithms
Patterns (Patterns), 2023
Dangwei Wu
Jing Yang
Mian Umair Ahsan
Kai Wang
03 Apr 2023
Gradient-Free Structured Pruning with Unlabeled Data
International Conference on Machine Learning (ICML), 2023
Azade Nova
H. Dai
Dale Schuurmans
SyDa
07 Mar 2023
Towards Optimal Compression: Joint Pruning and Quantization
Ben Zandonati
Glenn Bucagu
Adrian Alan Pol
M. Pierini
Olya Sirkin
Tal Kopetz
MQ
15 Feb 2023
idT5: Indonesian Version of Multilingual T5 Transformer
Mukhlish Fuadi
A. Wibawa
S. Sumpeno
02 Feb 2023
COMET: A Comprehensive Cluster Design Methodology for Distributed Deep Learning Training
D. Kadiyala
Saeed Rashidi
Taekyung Heo
Abhimanyu Bambhaniya
T. Krishna
Alexandros Daglis
VLM
30 Nov 2022
Intriguing Properties of Compression on Multilingual Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Kelechi Ogueji
Orevaoghene Ahia
Gbemileke Onilude
Sebastian Gehrmann
Sara Hooker
Julia Kreutzer
04 Nov 2022
Too Brittle To Touch: Comparing the Stability of Quantization and Distillation Towards Developing Lightweight Low-Resource MT Models
Conference on Machine Translation (WMT), 2022
Harshita Diddee
Sandipan Dandapat
Monojit Choudhury
T. Ganu
Kalika Bali
27 Oct 2022
HashFormers: Towards Vocabulary-independent Pre-trained Transformers
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Huiyin Xue
Nikolaos Aletras
14 Oct 2022
MiniALBERT: Model Distillation via Parameter-Efficient Recursive Transformers
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022
Mohammadmahdi Nouriborji
Omid Rohanian
Samaneh Kouchaki
David Clifton
12 Oct 2022
A Win-win Deal: Towards Sparse and Robust Pre-trained Language Models
Neural Information Processing Systems (NeurIPS), 2022
Yuanxin Liu
Fandong Meng
Zheng Lin
JiangNan Li
Peng Fu
Yanan Cao
Weiping Wang
Jie Zhou
11 Oct 2022
EffEval: A Comprehensive Evaluation of Efficiency for MT Evaluation Metrics
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Daniil Larionov
Jens Grunwald
Christoph Leiter
Steffen Eger
20 Sep 2022
Survey: Exploiting Data Redundancy for Optimization of Deep Learning
ACM Computing Surveys (ACM CSUR), 2022
Jou-An Chen
Wei Niu
Bin Ren
Yanzhi Wang
Xipeng Shen
29 Aug 2022
Efficient Fine-Tuning of Compressed Language Models with Learners
Danilo Vucetic
Mohammadreza Tayaranian
M. Ziaeefard
J. Clark
B. Meyer
W. Gross
03 Aug 2022
HiKonv: Maximizing the Throughput of Quantized Convolution With Novel Bit-wise Management and Computation
Yao Chen
Junhao Pan
Xinheng Liu
Jinjun Xiong
Deming Chen
MQ
22 Jul 2022
S4: a High-sparsity, High-performance AI Accelerator
Ian En-Hsu Yen
Zhibin Xiao
Dongkuan Xu
16 Jul 2022
Differentially Private Model Compression
Neural Information Processing Systems (NeurIPS), 2022
Fatemehsadat Mireshghallah
A. Backurs
Huseyin A. Inan
Lukas Wutschitz
Janardhan Kulkarni
SyDa
03 Jun 2022
Efficient Fine-Tuning of BERT Models on the Edge
International Symposium on Circuits and Systems (ISCAS), 2022
Danilo Vucetic
Mohammadreza Tayaranian
M. Ziaeefard
J. Clark
B. Meyer
W. Gross
03 May 2022
Structured Pruning Learns Compact and Accurate Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Mengzhou Xia
Zexuan Zhong
Danqi Chen
VLM
01 Apr 2022
Mokey: Enabling Narrow Fixed-Point Inference for Out-of-the-Box Floating-Point Transformer Models
International Symposium on Computer Architecture (ISCA), 2022
Ali Hadi Zadeh
Mostafa Mahmoud
Ameer Abdelhadi
Andreas Moshovos
MQ
23 Mar 2022
Unified Visual Transformer Compression
International Conference on Learning Representations (ICLR), 2022
Shixing Yu
Tianlong Chen
Jiayi Shen
Huan Yuan
Jianchao Tan
Sen Yang
Ji Liu
Zinan Lin
ViT
15 Mar 2022
ACORT: A Compact Object Relation Transformer for Parameter Efficient Image Captioning
Neurocomputing (Neurocomputing), 2022
J. Tan
Y. Tan
C. Chan
Joon Huang Chuah
VLM
ViT
11 Feb 2022
AutoDistil: Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models
Dongkuan Xu
Subhabrata Mukherjee
Xiaodong Liu
Debadeepta Dey
Wenhui Wang
Xiang Zhang
Ahmed Hassan Awadallah
Jianfeng Gao
29 Jan 2022
Pretrained Language Models for Text Generation: A Survey
ACM Computing Surveys (ACM CSUR), 2022
Junyi Li
Tianyi Tang
Wayne Xin Zhao
J. Nie
Ji-Rong Wen
AI4CE
14 Jan 2022
Which Student is Best? A Comprehensive Knowledge Distillation Exam for Task-Specific BERT Models
Made Nindyatama Nityasya
Haryo Akbarianto Wibowo
Rendi Chevi
Radityo Eko Prasojo
Alham Fikri Aji
03 Jan 2022
An Ensemble of Pre-trained Transformer Models For Imbalanced Multiclass Malware Classification
Computers & security (CS), 2021
Ferhat Demirkiran
Aykut Çayır
U. Ünal
Hasan Dag
25 Dec 2021
Benchmark Static API Call Datasets for Malware Family Classification
Berkant Düzgün
Aykut Çayır
Ferhat Demirkiran
Ceyda Nur Kahya
Buket Gençaydın
Hasan Dag
30 Nov 2021
Few-Shot Self-Rationalization with Natural Language Prompts
Ana Marasović
Iz Beltagy
Doug Downey
Matthew E. Peters
LRM
16 Nov 2021
When in Doubt, Summon the Titans: Efficient Inference with Large Models
A. S. Rawat
Manzil Zaheer
A. Menon
Amr Ahmed
Sanjiv Kumar
19 Oct 2021
Graph-Guided Network for Irregularly Sampled Multivariate Time Series
International Conference on Learning Representations (ICLR), 2021
Xiang Zhang
M. Zeman
Theodoros Tsiligkaridis
Marinka Zitnik
MLAU
AI4TS
11 Oct 2021
Cross-Modal Coherence for Text-to-Image Retrieval
AAAI Conference on Artificial Intelligence (AAAI), 2021
Malihe Alikhani
Fangda Han
Hareesh Ravi
Mubbasir Kapadia
Vladimir Pavlovic
Matthew Stone
22 Sep 2021