Knowledge Distillation Performs Partial Variance Reduction
M. Safaryan, Alexandra Peste, Dan Alistarh
arXiv:2305.17581, 27 May 2023
Papers citing "Knowledge Distillation Performs Partial Variance Reduction" (8 of 8 shown)
1. Importance Analysis for Dynamic Control of Balancing Parameter in a Simple Knowledge Distillation Setting
   Seongmin Kim, Kwanho Kim, Minseung Kim, Kanghyun Jo
   06 May 2025 · 0 citations

2. Feature Alignment and Representation Transfer in Knowledge Distillation for Large Language Models [VLM]
   Junjie Yang, Junhao Song, Xudong Han, Ziqian Bi, Tianyang Wang, ..., Y. Zhang, Qian Niu, Benji Peng, Keyu Chen, Ming Liu
   18 Apr 2025 · 0 citations

3. Improving Self-Training Under Distribution Shifts via Anchored Confidence with Theoretical Guarantees [UQCV]
   Taejong Joo, Diego Klabjan
   01 Nov 2024 · 0 citations

4. A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs
   A. S. Rawat, Veeranjaneyulu Sadhanala, Afshin Rostamizadeh, Ayan Chakrabarti, Wittawat Jitkrittum, ..., Rakesh Shivanna, Sashank J. Reddi, A. Menon, Rohan Anil, Sanjiv Kumar
   24 Oct 2024 · 2 citations

5. Provable Weak-to-Strong Generalization via Benign Overfitting
   David X. Wu, A. Sahai
   06 Oct 2024 · 6 citations

6. Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition
   Hamed Karimi, J. Nutini, Mark W. Schmidt
   16 Aug 2016 · 1,198 citations

7. ImageNet Large Scale Visual Recognition Challenge [VLM, ObjD]
   Olga Russakovsky, Jia Deng, Hao Su, J. Krause, S. Satheesh, ..., A. Karpathy, A. Khosla, Michael S. Bernstein, Alexander C. Berg, Li Fei-Fei
   01 Sep 2014 · 39,194 citations

8. A Proximal Stochastic Gradient Method with Progressive Variance Reduction [ODL]
   Lin Xiao, Tong Zhang
   19 Mar 2014 · 736 citations