BabyLlama-2: Ensemble-Distilled Models Consistently Outperform Teachers With Limited Data

25 September 2024
J. Tastet
I. Timiryasov

Papers citing "BabyLlama-2: Ensemble-Distilled Models Consistently Outperform Teachers With Limited Data"

3 / 3 papers shown
Pretraining Language Models for Diachronic Linguistic Change Discovery
Elisabeth Fittschen
Sabrina Li
Tom Lippincott
Leshem Choshen
Craig Messner
07 Apr 2025
CoSMoEs: Compact Sparse Mixture of Experts
Patrick Huber
Akshat Shrivastava
Ernie Chang
Chinnadhurai Sankar
Ahmed Aly
Adithya Sagar
MoE
28 Feb 2025
Findings of the Second BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora
Michael Y. Hu
Aaron Mueller
Candace Ross
Adina Williams
Tal Linzen
Chengxu Zhuang
Ryan Cotterell
Leshem Choshen
Alex Warstadt
Ethan Gotlieb Wilcox
06 Dec 2024