Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2310.08797
Cited By
A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models
13 October 2023
Takuma Udagawa
Aashka Trivedi
Michele Merler
Bishwaranjan Bhattacharjee
Re-assign community
ArXiv
PDF
HTML
Papers citing
"A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models"
11 / 11 papers shown
Title
GPTA: Generative Prompt Tuning Assistant for Synergistic Downstream Neural Network Enhancement with LLMs
Xiao Liu
Jiawei Zhang
24
0
0
29 Mar 2024
Multiple Representation Transfer from Large Language Models to End-to-End ASR Systems
Takuma Udagawa
Masayuki Suzuki
Gakuto Kurata
Masayasu Muraoka
G. Saon
22
2
0
07 Sep 2023
Neural Architecture Search for Effective Teacher-Student Knowledge Transfer in Language Models
Aashka Trivedi
Takuma Udagawa
Michele Merler
Rameswar Panda
Yousef El-Kurdi
Bishwaranjan Bhattacharjee
14
6
0
16 Mar 2023
Multilingual Representation Distillation with Contrastive Learning
Weiting Tan
Kevin Heffernan
Holger Schwenk
Philipp Koehn
35
16
0
10 Oct 2022
Distilling Linguistic Context for Language Model Compression
Geondo Park
Gyeongman Kim
Eunho Yang
45
37
0
17 Sep 2021
I-BERT: Integer-only BERT Quantization
Sehoon Kim
A. Gholami
Z. Yao
Michael W. Mahoney
Kurt Keutzer
MQ
86
336
0
05 Jan 2021
BinaryBERT: Pushing the Limit of BERT Quantization
Haoli Bai
Wei Zhang
Lu Hou
Lifeng Shang
Jing Jin
Xin Jiang
Qun Liu
Michael Lyu
Irwin King
MQ
138
221
0
31 Dec 2020
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing
Canwen Xu
Wangchunshu Zhou
Tao Ge
Furu Wei
Ming Zhou
221
197
0
07 Feb 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
226
4,424
0
23 Jan 2020
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
225
571
0
12 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,943
0
20 Apr 2018
1