A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models

13 October 2023
Takuma Udagawa, Aashka Trivedi, Michele Merler, Bishwaranjan Bhattacharjee

Papers citing "A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models"

11 papers shown

GPTA: Generative Prompt Tuning Assistant for Synergistic Downstream Neural Network Enhancement with LLMs
Xiao Liu, Jiawei Zhang
29 Mar 2024

Multiple Representation Transfer from Large Language Models to End-to-End ASR Systems
Takuma Udagawa, Masayuki Suzuki, Gakuto Kurata, Masayasu Muraoka, G. Saon
07 Sep 2023

Neural Architecture Search for Effective Teacher-Student Knowledge Transfer in Language Models
Aashka Trivedi, Takuma Udagawa, Michele Merler, Rameswar Panda, Yousef El-Kurdi, Bishwaranjan Bhattacharjee
16 Mar 2023

Multilingual Representation Distillation with Contrastive Learning
Weiting Tan, Kevin Heffernan, Holger Schwenk, Philipp Koehn
10 Oct 2022

Distilling Linguistic Context for Language Model Compression
Geondo Park, Gyeongman Kim, Eunho Yang
17 Sep 2021

I-BERT: Integer-only BERT Quantization
Sehoon Kim, A. Gholami, Z. Yao, Michael W. Mahoney, Kurt Keutzer
MQ
05 Jan 2021

BinaryBERT: Pushing the Limit of BERT Quantization
Haoli Bai, Wei Zhang, Lu Hou, Lifeng Shang, Jing Jin, Xin Jiang, Qun Liu, Michael Lyu, Irwin King
MQ
31 Dec 2020

BERT-of-Theseus: Compressing BERT by Progressive Module Replacing
Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei, Ming Zhou
07 Feb 2020

Scaling Laws for Neural Language Models
Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei
23 Jan 2020

Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen, Zhen Dong, Jiayu Ye, Linjian Ma, Z. Yao, A. Gholami, Michael W. Mahoney, Kurt Keutzer
MQ
12 Sep 2019

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
ELM
20 Apr 2018