
How to Distill your BERT: An Empirical Study on the Impact of Weight Initialisation and Distillation Objectives

24 May 2023 (arXiv: 2305.15032)
Xinpeng Wang, Leonie Weissweiler, Hinrich Schütze, Barbara Plank

Papers citing "How to Distill your BERT: An Empirical Study on the Impact of Weight Initialisation and Distillation Objectives"

3 papers shown

Why Lift so Heavy? Slimming Large Language Models by Cutting Off the Layers
Shuzhou Yuan, Ercong Nie, Bolei Ma, Michael Färber
18 Feb 2024

Neural Architecture Search for Effective Teacher-Student Knowledge Transfer in Language Models
Aashka Trivedi, Takuma Udagawa, Michele Merler, Rameswar Panda, Yousef El-Kurdi, Bishwaranjan Bhattacharjee
16 Mar 2023

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
20 Apr 2018