ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2311.01644
  4. Cited By
Should Under-parameterized Student Networks Copy or Average Teacher
  Weights?

Should Under-parameterized Student Networks Copy or Average Teacher Weights?

3 November 2023
Berfin Simsek
Amire Bendjeddou
W. Gerstner
Johanni Brea
ArXivPDFHTML

Papers citing "Should Under-parameterized Student Networks Copy or Average Teacher Weights?"

5 / 5 papers shown
Title
Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence
Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence
Berfin Simsek
Amire Bendjeddou
Daniel Hsu
32
0
0
13 Nov 2024
Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding
Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding
Zhengqing Wu
Berfin Simsek
Francois Ged
ODL
30
0
0
08 Feb 2024
Learning Single-Index Models with Shallow Neural Networks
Learning Single-Index Models with Shallow Neural Networks
A. Bietti
Joan Bruna
Clayton Sanford
M. Song
155
65
0
27 Oct 2022
Neural Networks Efficiently Learn Low-Dimensional Representations with
  SGD
Neural Networks Efficiently Learn Low-Dimensional Representations with SGD
Alireza Mousavi-Hosseini
Sejun Park
M. Girotti
Ioannis Mitliagkas
Murat A. Erdogdu
MLT
313
48
0
29 Sep 2022
Toy Models of Superposition
Toy Models of Superposition
Nelson Elhage
Tristan Hume
Catherine Olsson
Nicholas Schiefer
T. Henighan
...
Sam McCandlish
Jared Kaplan
Dario Amodei
Martin Wattenberg
C. Olah
AAML
MILM
120
314
0
21 Sep 2022
1