A Study on Knowledge Distillation from Weak Teacher for Scaling Up Pre-trained Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2023
26 May 2023
Hayeon Lee, Rui Hou, Jongpil Kim, Davis Liang, Sung Ju Hwang, Alexander Min

Papers citing "A Study on Knowledge Distillation from Weak Teacher for Scaling Up Pre-trained Language Models"

6 citing papers

Dynamic Self-Distillation via Previous Mini-batches for Fine-tuning Small Language Models
Y. Fu, Yin Yu, Xiaotian Han, Runchao Li, Xianxuan Long, Haotian Yu, Pan Li
SyDa
426 · 0 · 0
25 Nov 2024

Weak-to-Strong Generalization beyond Accuracy: a Pilot Study in Safety, Toxicity, and Legal Reasoning
Ruimeng Ye, Yang Xiao, Bo Hui
ALM, ELM, OffRL
369 · 6 · 0
16 Oct 2024

Determine-Then-Ensemble: Necessity of Top-k Union for Large Language Model Ensembling
International Conference on Learning Representations (ICLR), 2024
Yuxuan Yao, Han Wu, Mingyang Liu, Sichun Luo, Xiongwei Han, Jie Liu, Zhijiang Guo, Linqi Song
280 · 21 · 0
03 Oct 2024

Survey on Knowledge Distillation for Large Language Models: Methods, Evaluation, and Application
Chuanpeng Yang, Wang Lu, Yao Zhu, Yidong Wang, Qian Chen, Chenlong Gao, Bingjie Yan, Yiqiang Chen
ALM, KELM
307 · 99 · 0
02 Jul 2024

Step Out and Seek Around: On Warm-Start Training with Incremental Data
Maying Shen, Hongxu Yin, Pavlo Molchanov, Lei Mao, Jose M. Alvarez
CLL
337 · 3 · 0
06 Jun 2024

Co-training and Co-distillation for Quality Improvement and Compression of Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Hayeon Lee, Rui Hou, Jongpil Kim, Davis Liang, Hongbo Zhang, Sung Ju Hwang, Alexander Min
430 · 2 · 0
06 Nov 2023