2311.01644
Should Under-parameterized Student Networks Copy or Average Teacher Weights?
3 November 2023
Berfin Simsek
Amire Bendjeddou
W. Gerstner
Johanni Brea
Papers citing "Should Under-parameterized Student Networks Copy or Average Teacher Weights?"
5 / 5 papers shown
Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence
Berfin Simsek
Amire Bendjeddou
Daniel Hsu
32
0
0
13 Nov 2024
Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding
Zhengqing Wu
Berfin Simsek
Francois Ged
ODL
30
0
0
08 Feb 2024
Learning Single-Index Models with Shallow Neural Networks
A. Bietti
Joan Bruna
Clayton Sanford
M. Song
155
65
0
27 Oct 2022
Neural Networks Efficiently Learn Low-Dimensional Representations with SGD
Alireza Mousavi-Hosseini
Sejun Park
M. Girotti
Ioannis Mitliagkas
Murat A. Erdogdu
MLT
313
48
0
29 Sep 2022
Toy Models of Superposition
Nelson Elhage
Tristan Hume
Catherine Olsson
Nicholas Schiefer
T. Henighan
...
Sam McCandlish
Jared Kaplan
Dario Amodei
Martin Wattenberg
C. Olah
AAML
MILM
120
314
0
21 Sep 2022