Survey on Knowledge Distillation for Large Language Models: Methods, Evaluation, and Application

2 July 2024
Chuanpeng Yang
Wang Lu
Yao Zhu
Yidong Wang
Qian Chen
Chenlong Gao
Bingjie Yan
Yiqiang Chen
    ALM
    KELM

Papers citing "Survey on Knowledge Distillation for Large Language Models: Methods, Evaluation, and Application"

31 / 31 papers shown
Title
Quantitative Analysis of Performance Drop in DeepSeek Model Quantization
Enbo Zhao
Yi Shen
Shuming Shi
Jieyun Huang
Z. Chen
Ning Wang
Siqi Xiao
J. Zhang
Kai Wang
Shiguo Lian
MQ
29
0
0
05 May 2025
Learning and Transferring Physical Models through Derivatives
Alessandro Trenta
Andrea Cossu
Davide Bacciu
AI4CE
26
0
0
02 May 2025
Towards Harnessing the Collaborative Power of Large and Small Models for Domain Tasks
Yang Janet Liu
Bingjie Yan
Tianyuan Zou
Jianqing Zhang
Zixuan Gu
...
J. Li
Xiaozhou Ye
Ye Ouyang
Qiang Yang
Y. Zhang
ALM
48
0
0
24 Apr 2025
The Rise of Small Language Models in Healthcare: A Comprehensive Survey
Muskan Garg
Shaina Raza
Shebuti Rayana
Xingyi Liu
Sunghwan Sohn
LM&MA
AILaw
87
0
0
23 Apr 2025
Feature Alignment and Representation Transfer in Knowledge Distillation for Large Language Models
Junjie Yang
Junhao Song
Xudong Han
Ziqian Bi
Tianyang Wang
...
Y. Zhang
Qian Niu
Benji Peng
Keyu Chen
Ming Liu
VLM
33
0
0
18 Apr 2025
Quantization Error Propagation: Revisiting Layer-Wise Post-Training Quantization
Yamato Arai
Yuma Ichikawa
MQ
21
0
0
13 Apr 2025
UNDO: Understanding Distillation as Optimization
Kushal Kumar Jain
Piyushi Goyal
Kumar Shridhar
31
0
0
03 Apr 2025
Adaptive Temperature Based on Logits Correlation in Knowledge Distillation
Kazuhiro Matsuyama
Usman Anjum
Satoko Matsuyama
Tetsuo Shoda
J. Zhan
52
0
0
12 Mar 2025
Training LLM-based Tutors to Improve Student Learning Outcomes in Dialogues
Alexander Scarlatos
Naiming Liu
Jaewook Lee
Richard Baraniuk
Andrew S. Lan
39
1
0
09 Mar 2025
I Know What I Don't Know: Improving Model Cascades Through Confidence Tuning
Stephan Rabanser
Nathalie Rauschmayr
Achin Kulshrestha
Petra Poklukar
Wittawat Jitkrittum
Sean Augenstein
Congchao Wang
Federico Tombari
37
0
0
26 Feb 2025
Practical Principles for AI Cost and Compute Accounting
Stephen Casper
Luke Bailey
Tim Schreier
36
0
0
21 Feb 2025
SleepCoT: A Lightweight Personalized Sleep Health Model via Chain-of-Thought Distillation
Huimin Zheng
Xiaofeng Xing
Xiangmin Xu
VLM
39
1
0
22 Oct 2024
Optimizing Vision Transformers with Data-Free Knowledge Transfer
Gousia Habib
Damandeep Singh
I. Malik
Brejesh Lall
24
0
0
12 Aug 2024
Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
Cheng-Yu Hsieh
Chun-Liang Li
Chih-Kuan Yeh
Hootan Nakhost
Yasuhisa Fujii
Alexander Ratner
Ranjay Krishna
Chen-Yu Lee
Tomas Pfister
ALM
198
283
0
03 May 2023
A Systematic Study of Knowledge Distillation for Natural Language Generation with Pseudo-Target Training
Nitay Calderon
Subhabrata Mukherjee
Roi Reichart
Amir Kantor
21
17
0
03 May 2023
Huatuo-26M, a Large-scale Chinese Medical QA Dataset
Jianquan Li
Xidong Wang
Xiangbo Wu
Zhiyi Zhang
Xiaolong Xu
Jie Fu
Prayag Tiwari
Xiang Wan
Benyou Wang
LM&MA
60
40
0
02 May 2023
PMC-LLaMA: Towards Building Open-source Language Models for Medicine
Chaoyi Wu
Weixiong Lin
Xiaoman Zhang
Ya Zhang
Yanfeng Wang
Weidi Xie
LM&MA
AI4MH
81
74
0
27 Apr 2023
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions
Minghao Wu
Abdul Waheed
Chiyu Zhang
Muhammad Abdul-Mageed
Alham Fikri Aji
ALM
118
115
0
27 Apr 2023
ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge
Yunxiang Li
Zihan Li
Kai Zhang
Ruilong Dan
Steven Jiang
You Zhang
LM&MA
AI4MH
114
366
0
24 Mar 2023
Language Models are Multilingual Chain-of-Thought Reasoners
Freda Shi
Mirac Suzgun
Markus Freitag
Xuezhi Wang
Suraj Srivats
...
Yi Tay
Sebastian Ruder
Denny Zhou
Dipanjan Das
Jason W. Wei
ReLM
LRM
162
320
0
06 Oct 2022
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
A. Kalyan
ELM
ReLM
LRM
198
1,089
0
20 Sep 2022
DDXPlus: A New Dataset For Automatic Medical Diagnosis
Arsène Fansi Tchango
Rishab Goel
Zhi Wen
Julien Martel
J. Ghosn
99
35
0
18 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
203
1,651
0
15 Oct 2021
Meta-learning via Language Model In-context Tuning
Yanda Chen
Ruiqi Zhong
Sheng Zha
George Karypis
He He
210
155
0
15 Oct 2021
CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP
Qinyuan Ye
Bill Yuchen Lin
Xiang Ren
199
167
0
18 Apr 2021
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies
Mor Geva
Daniel Khashabi
Elad Segal
Tushar Khot
Dan Roth
Jonathan Berant
RALM
242
460
0
06 Jan 2021
PubMedQA: A Dataset for Biomedical Research Question Answering
Qiao Jin
Bhuwan Dhingra
Zhengping Liu
William W. Cohen
Xinghua Lu
196
791
0
13 Sep 2019
Language Models as Knowledge Bases?
Fabio Petroni
Tim Rocktäschel
Patrick Lewis
A. Bakhtin
Yuxiang Wu
Alexander H. Miller
Sebastian Riedel
KELM
AI4MH
391
2,216
0
03 Sep 2019
e-SNLI: Natural Language Inference with Natural Language Explanations
Oana-Maria Camburu
Tim Rocktäschel
Thomas Lukasiewicz
Phil Blunsom
LRM
249
618
0
04 Dec 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018