How to Distill your BERT: An Empirical Study on the Impact of Weight Initialisation and Distillation Objectives
Xinpeng Wang, Leonie Weissweiler, Hinrich Schütze, Barbara Plank
arXiv: 2305.15032 · 24 May 2023

Papers citing "How to Distill your BERT: An Empirical Study on the Impact of Weight Initialisation and Distillation Objectives" (3 papers)

Why Lift so Heavy? Slimming Large Language Models by Cutting Off the Layers
Shuzhou Yuan, Ercong Nie, Bolei Ma, Michael Farber
18 Feb 2024

Neural Architecture Search for Effective Teacher-Student Knowledge Transfer in Language Models
Aashka Trivedi, Takuma Udagawa, Michele Merler, Rameswar Panda, Yousef El-Kurdi, Bishwaranjan Bhattacharjee
16 Mar 2023

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
20 Apr 2018