ResearchTrend.AI
Knowledge Distillation vs. Pretraining from Scratch under a Fixed (Computation) Budget

30 April 2024
Minh Duc Bui
Fabian David Schmidt
Goran Glavaš
K. Wense

Papers citing "Knowledge Distillation vs. Pretraining from Scratch under a Fixed (Computation) Budget"

2 / 2 papers shown
Title
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
226
4,453
0
23 Jan 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,950
0
20 Apr 2018