ResearchTrend.AI

Pareto Frontiers in Neural Feature Learning: Data, Compute, Width, and Luck
7 September 2023
Benjamin L. Edelman
Surbhi Goel
Sham Kakade
Eran Malach
Cyril Zhang
arXiv:2309.03800

Papers citing "Pareto Frontiers in Neural Feature Learning: Data, Compute, Width, and Luck"

14 papers shown
Towards a theory of model distillation
Enric Boix-Adserà · 14 Mar 2024 · FedML, VLM
Complexity Matters: Dynamics of Feature Learning in the Presence of Spurious Correlations
GuanWen Qiu, Da Kuang, Surbhi Goel · 05 Mar 2024
TinyGSM: achieving >80% on GSM8k with small language models
Bingbin Liu, Sébastien Bubeck, Ronen Eldan, Janardhan Kulkarni, Yuanzhi Li, Anh Nguyen, Rachel A. Ward, Yi Zhang · 14 Dec 2023 · ALM
Feature emergence via margin maximization: case studies in algebraic tasks
Depen Morwani, Benjamin L. Edelman, Costin-Andrei Oncescu, Rosie Zhao, Sham Kakade · 13 Nov 2023
Optimizing Solution-Samplers for Combinatorial Problems: The Landscape of Policy-Gradient Methods
C. Caramanis, Dimitris Fotakis, Alkis Kalavasis, Vasilis Kontonis, Christos Tzamos · 08 Oct 2023
Understanding MLP-Mixer as a Wide and Sparse MLP
Tomohiro Hayase, Ryo Karakida · 02 Jun 2023 · MoE
SGD learning on neural networks: leap complexity and saddle-to-saddle dynamics
Emmanuel Abbe, Enric Boix-Adserà, Theodor Misiakiewicz · 21 Feb 2023 · FedML, MLT
Learning Single-Index Models with Shallow Neural Networks
A. Bietti, Joan Bruna, Clayton Sanford, M. Song · 27 Oct 2022
Omnigrok: Grokking Beyond Algorithmic Data
Ziming Liu, Eric J. Michaud, Max Tegmark · 03 Oct 2022
Sparse tree-based initialization for neural networks
P. Lutz, Ludovic Arnould, Claire Boyer, Erwan Scornet · 30 Sep 2022
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin, N. Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, ..., Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy · 04 May 2021
Learning Curve Theory
Marcus Hutter · 08 Feb 2021
Scaling Laws for Neural Language Models
Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei · 23 Jan 2020
Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes
Ohad Shamir, Tong Zhang · 08 Dec 2012