High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation
Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang · MLT
arXiv: 2205.01445 · 3 May 2022
Papers citing "High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation" (50 of 98 shown)
Scaling Laws and Representation Learning in Simple Hierarchical Languages: Transformers vs. Convolutional Architectures
Francesco Cagnetta, Alessandro Favero, Antonio Sclocchi, M. Wyart · 11 May 2025

Survey on Algorithms for multi-index models
Joan Bruna, Daniel Hsu · 07 Apr 2025

Feature learning from non-Gaussian inputs: the case of Independent Component Analysis in high dimensions
Fabiola Ricci, Lorenzo Bardone, Sebastian Goldt · OOD · 31 Mar 2025

Feature Learning beyond the Lazy-Rich Dichotomy: Insights from Representational Geometry
Chi-Ning Chou, Hang Le, Yichen Wang, SueYeon Chung · 23 Mar 2025

On the Cone Effect in the Learning Dynamics
Zhanpeng Zhou, Yongyi Yang, Jie Ren, Mahito Sugiyama, Junchi Yan · 20 Mar 2025

When Do Transformers Outperform Feedforward and Recurrent Networks? A Statistical Perspective
Alireza Mousavi-Hosseini, Clayton Sanford, Denny Wu, Murat A. Erdogdu · 14 Mar 2025

Asymptotic Analysis of Two-Layer Neural Networks after One Gradient Step under Gaussian Mixtures Data with Structure
Samet Demir, Zafer Dogan · MLT · 02 Mar 2025

A distributional simplicity bias in the learning dynamics of transformers
Riccardo Rende, Federica Gerace, A. Laio, Sebastian Goldt · 17 Feb 2025

Low-dimensional Functions are Efficiently Learnable under Randomly Biased Distributions
Elisabetta Cornacchia, Dan Mikulincer, Elchanan Mossel · 10 Feb 2025

Propagation of Chaos for Mean-Field Langevin Dynamics and its Application to Model Ensemble
Atsushi Nitanda, Anzelle Lee, Damian Tan Xing Kai, Mizuki Sakaguchi, Taiji Suzuki · AI4CE · 09 Feb 2025

The Complexity of Learning Sparse Superposed Features with Feedback
Akash Kumar · 08 Feb 2025

Spurious Correlations in High Dimensional Regression: The Roles of Regularization, Simplicity Bias and Over-Parameterization
Simone Bombari, Marco Mondelli · 03 Feb 2025

Optimal Exact Recovery in Semi-Supervised Learning: A Study of Spectral Methods and Graph Convolutional Networks
Hai-Xiao Wang, Zhichao Wang · 18 Dec 2024

On the Efficiency of ERM in Feature Learning
Ayoub El Hanchi, Chris J. Maddison, Murat A. Erdogdu · 18 Nov 2024

Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence
Berfin Simsek, Amire Bendjeddou, Daniel Hsu · 13 Nov 2024

Pretrained transformer efficiently learns low-dimensional target functions in-context
Kazusato Oko, Yujin Song, Taiji Suzuki, Denny Wu · 04 Nov 2024

On the phase diagram of extensive-rank symmetric matrix denoising beyond rotational invariance
Jean Barbier, Francesco Camilli, Justin Ko, Koki Okajima · 04 Nov 2024

Normalization Layer Per-Example Gradients are Sufficient to Predict Gradient Noise Scale in Transformers
Gavia Gray, Aman Tiwari, Shane Bergsma, Joel Hestness · 01 Nov 2024

A Random Matrix Theory Perspective on the Spectrum of Learned Features and Asymptotic Generalization Capabilities
Yatin Dandi, Luca Pesce, Hugo Cui, Florent Krzakala, Yue M. Lu, Bruno Loureiro · MLT · 24 Oct 2024

Robust Feature Learning for Multi-Index Models in High Dimensions
Alireza Mousavi-Hosseini, Adel Javanmard, Murat A. Erdogdu · OOD, AAML · 21 Oct 2024

Generalization for Least Squares Regression With Simple Spiked Covariances
Jiping Li, Rishi Sonthalia · 17 Oct 2024

Sharper Guarantees for Learning Neural Network Classifiers with Gradient Methods
Hossein Taheri, Christos Thrampoulidis, Arya Mazumdar · MLT · 13 Oct 2024

Task Diversity Shortens the ICL Plateau
Jaeyeon Kim, Sehyun Kwon, Joo Young Choi, Jongho Park, Jaewoong Cho, Jason D. Lee, Ernest K. Ryu · MoMe · 07 Oct 2024

Random Features Outperform Linear Models: Effect of Strong Input-Label Correlation in Spiked Covariance Data
Samet Demir, Zafer Dogan · 30 Sep 2024

How Feature Learning Can Improve Neural Scaling Laws
Blake Bordelon, Alexander B. Atanasov, C. Pehlevan · 26 Sep 2024

From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks
Clémentine Dominé, Nicolas Anguita, A. Proca, Lukas Braun, D. Kunin, P. Mediano, Andrew M. Saxe · 22 Sep 2024

Improving Adaptivity via Over-Parameterization in Sequence Models
Yicheng Li, Qian Lin · 02 Sep 2024

Learning Multi-Index Models with Neural Networks via Mean-Field Langevin Dynamics
Alireza Mousavi-Hosseini, Denny Wu, Murat A. Erdogdu · MLT, AI4CE · 14 Aug 2024

On the Generalization of Preference Learning with DPO
Shawn Im, Yixuan Li · 06 Aug 2024

Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning
D. Kunin, Allan Raventós, Clémentine Dominé, Feng Chen, David Klindt, Andrew M. Saxe, Surya Ganguli · MLT · 10 Jun 2024

Crafting Heavy-Tails in Weight Matrix Spectrum without Gradient Noise
Vignesh Kothapalli, Tianyu Pang, Shenyang Deng, Zongmin Liu, Yaoqing Yang · 07 Jun 2024

Online Learning and Information Exponents: On The Importance of Batch size, and Time/Complexity Tradeoffs
Luca Arnaboldi, Yatin Dandi, Florent Krzakala, Bruno Loureiro, Luca Pesce, Ludovic Stephan · 04 Jun 2024

Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit
Jason D. Lee, Kazusato Oko, Taiji Suzuki, Denny Wu · MLT · 03 Jun 2024

Understanding and Minimising Outlier Features in Neural Network Training
Bobby He, Lorenzo Noci, Daniele Paliotta, Imanol Schlag, Thomas Hofmann · 29 May 2024

Signal-Plus-Noise Decomposition of Nonlinear Spiked Random Matrix Models
Behrad Moniri, Hamed Hassani · 28 May 2024

Simplicity Bias of Two-Layer Networks beyond Linearly Separable Data
Nikita Tsoy, Nikola Konstantinov · 27 May 2024

Repetita Iuvant: Data Repetition Allows SGD to Learn High-Dimensional Multi-Index Functions
Luca Arnaboldi, Yatin Dandi, Florent Krzakala, Luca Pesce, Ludovic Stephan · 24 May 2024

Understanding Optimal Feature Transfer via a Fine-Grained Bias-Variance Analysis
Yufan Li, Subhabrata Sen, Ben Adlam · MLT · 18 Apr 2024

Sliding down the stairs: how correlated latent variables accelerate learning with neural networks
Lorenzo Bardone, Sebastian Goldt · 12 Apr 2024

Understanding the Learning Dynamics of Alignment with Human Feedback
Shawn Im, Yixuan Li · ALM · 27 Mar 2024

Mean-field Analysis on Two-layer Neural Networks from a Kernel Perspective
Shokichi Takakura, Taiji Suzuki · MLT · 22 Mar 2024

Generalization of Scaled Deep ResNets in the Mean-Field Regime
Yihang Chen, Fanghui Liu, Yiping Lu, Grigorios G. Chrysos, V. Cevher · 14 Mar 2024

Towards a theory of model distillation
Enric Boix-Adserà · FedML, VLM · 14 Mar 2024

Fundamental limits of Non-Linear Low-Rank Matrix Estimation
Pierre Mergny, Justin Ko, Florent Krzakala, Lenka Zdeborová · 07 Mar 2024

Asymptotics of Learning with Deep Structured (Random) Features
Dominik Schröder, Daniil Dmitriev, Hugo Cui, Bruno Loureiro · 21 Feb 2024

The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains
Benjamin L. Edelman, Ezra Edelman, Surbhi Goel, Eran Malach, Nikolaos Tsilivis · BDL · 16 Feb 2024

Feature learning as alignment: a structural property of gradient descent in non-linear neural networks
Daniel Beaglehole, Ioannis Mitliagkas, Atish Agarwala · MLT · 07 Feb 2024

Asymptotics of feature learning in two-layer networks after one gradient-step
Hugo Cui, Luca Pesce, Yatin Dandi, Florent Krzakala, Yue M. Lu, Lenka Zdeborová, Bruno Loureiro · MLT · 07 Feb 2024

The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents
Yatin Dandi, Emanuele Troiani, Luca Arnaboldi, Luca Pesce, Lenka Zdeborová, Florent Krzakala · MLT · 05 Feb 2024

Towards Understanding the Word Sensitivity of Attention Layers: A Study via Random Features
Simone Bombari, Marco Mondelli · 05 Feb 2024