Rethinking Kullback-Leibler Divergence in Knowledge Distillation for Large Language Models
arXiv: 2404.02657 · 3 April 2024
Taiqiang Wu, Chaofan Tao, Jiahao Wang, Zhe Zhao, Ngai Wong
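For context on the topic named in the title, the sketch below illustrates the two KL variants typically contrasted in KL-based distillation of language models: forward KL(teacher || student) and reverse KL(student || teacher), computed token-wise over the vocabulary. This is a minimal, generic illustration assuming PyTorch-style teacher and student logits; it is not the paper's own implementation or exact objective.

```python
# Minimal sketch (not the paper's code): token-level forward vs. reverse KL
# between a teacher and a student language model, given logits of shape
# (batch, sequence_length, vocab_size). Assumes PyTorch is available.
import torch
import torch.nn.functional as F

def forward_kl(teacher_logits: torch.Tensor, student_logits: torch.Tensor) -> torch.Tensor:
    """KL(p_teacher || q_student), averaged over all token positions."""
    p = F.softmax(teacher_logits, dim=-1)
    log_p = F.log_softmax(teacher_logits, dim=-1)
    log_q = F.log_softmax(student_logits, dim=-1)
    return (p * (log_p - log_q)).sum(dim=-1).mean()

def reverse_kl(teacher_logits: torch.Tensor, student_logits: torch.Tensor) -> torch.Tensor:
    """KL(q_student || p_teacher), averaged over all token positions."""
    q = F.softmax(student_logits, dim=-1)
    log_q = F.log_softmax(student_logits, dim=-1)
    log_p = F.log_softmax(teacher_logits, dim=-1)
    return (q * (log_q - log_p)).sum(dim=-1).mean()

if __name__ == "__main__":
    # Toy shapes: batch of 2 sequences, 4 tokens each, vocabulary of 8.
    teacher = torch.randn(2, 4, 8)
    student = torch.randn(2, 4, 8)
    print(forward_kl(teacher, student).item(), reverse_kl(teacher, student).item())
```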
Papers citing "Rethinking Kullback-Leibler Divergence in Knowledge Distillation for Large Language Models" (7 of 7 papers shown):
ABKD: Pursuing a Proper Allocation of the Probability Mass in Knowledge Distillation via α-β-Divergence
Guanghui Wang, Zhiyong Yang, Z. Wang, Shi Wang, Qianqian Xu, Q. Huang
07 May 2025
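For readers unfamiliar with the divergence family named in the ABKD title above, one common parametrization of the αβ-divergence is given below. This is an assumed, general form from the alpha-beta divergence literature; ABKD may use a different normalization or restrict to special cases.

```latex
% One standard form of the alpha-beta divergence between distributions P and Q
% (assumed parametrization; not necessarily the exact one used in ABKD).
% Valid for \alpha, \beta, \alpha + \beta \neq 0; for normalized P and Q,
% KL(P \| Q) is recovered in the limit \alpha \to 1, \beta \to 0.
D_{AB}^{(\alpha,\beta)}(P \,\|\, Q)
  = -\frac{1}{\alpha\beta} \sum_i \left(
      p_i^{\alpha} q_i^{\beta}
      - \frac{\alpha}{\alpha+\beta}\, p_i^{\alpha+\beta}
      - \frac{\beta}{\alpha+\beta}\, q_i^{\alpha+\beta}
    \right)
```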
Head-Tail-Aware KL Divergence in Knowledge Distillation for Spiking Neural Networks
Tianqing Zhang, Zixin Zhu, Kairong Yu, Hongwei Wang
29 Apr 2025
LLM-NEO: Parameter Efficient Knowledge Distillation for Large Language Models
Runming Yang, Taiqiang Wu, Jiahao Wang, Pengfei Hu, Ngai Wong, Yujiu Yang
11 Nov 2024
MiniPLM: Knowledge Distillation for Pre-Training Language Models
Yuxian Gu, Hao Zhou, Fandong Meng, Jie Zhou, Minlie Huang
22 Oct 2024
Model Compression and Efficient Inference for Large Language Models: A Survey
Wenxiao Wang, Wei Chen, Yicong Luo, Yongliu Long, Zhengkai Lin, Liye Zhang, Binbin Lin, Deng Cai, Xiaofei He
15 Feb 2024
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
Wei Huang, Yangdong Liu, Haotong Qin, Ying Li, Shiming Zhang, Xianglong Liu, Michele Magno, Xiaojuan Qi
06 Feb 2024
Weight-Inherited Distillation for Task-Agnostic BERT Compression
Taiqiang Wu, Cheng-An Hou, Shanshan Lao, Jiayi Li, Ngai Wong, Zhe Zhao, Yujiu Yang
16 May 2023