ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2205.01445
  4. Cited By
High-dimensional Asymptotics of Feature Learning: How One Gradient Step
  Improves the Representation

High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation

3 May 2022
Jimmy Ba
Murat A. Erdogdu
Taiji Suzuki
Zhichao Wang
Denny Wu
Greg Yang
    MLT
ArXivPDFHTML

Papers citing "High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation"

50 / 98 papers shown
Title
Scaling Laws and Representation Learning in Simple Hierarchical Languages: Transformers vs. Convolutional Architectures
Scaling Laws and Representation Learning in Simple Hierarchical Languages: Transformers vs. Convolutional Architectures
Francesco Cagnetta
Alessandro Favero
Antonio Sclocchi
M. Wyart
21
0
0
11 May 2025
Survey on Algorithms for multi-index models
Survey on Algorithms for multi-index models
Joan Bruna
Daniel Hsu
18
0
0
07 Apr 2025
Feature learning from non-Gaussian inputs: the case of Independent Component Analysis in high dimensions
Feature learning from non-Gaussian inputs: the case of Independent Component Analysis in high dimensions
Fabiola Ricci
Lorenzo Bardone
Sebastian Goldt
OOD
33
0
0
31 Mar 2025
Feature Learning beyond the Lazy-Rich Dichotomy: Insights from Representational Geometry
Feature Learning beyond the Lazy-Rich Dichotomy: Insights from Representational Geometry
Chi-Ning Chou
Hang Le
Yichen Wang
SueYeon Chung
44
0
0
23 Mar 2025
On the Cone Effect in the Learning Dynamics
On the Cone Effect in the Learning Dynamics
Zhanpeng Zhou
Yongyi Yang
Jie Ren
Mahito Sugiyama
Junchi Yan
46
0
0
20 Mar 2025
When Do Transformers Outperform Feedforward and Recurrent Networks? A Statistical Perspective
Alireza Mousavi-Hosseini
Clayton Sanford
Denny Wu
Murat A. Erdogdu
43
0
0
14 Mar 2025
Asymptotic Analysis of Two-Layer Neural Networks after One Gradient Step under Gaussian Mixtures Data with Structure
Asymptotic Analysis of Two-Layer Neural Networks after One Gradient Step under Gaussian Mixtures Data with Structure
Samet Demir
Zafer Dogan
MLT
34
0
0
02 Mar 2025
A distributional simplicity bias in the learning dynamics of transformers
A distributional simplicity bias in the learning dynamics of transformers
Riccardo Rende
Federica Gerace
A. Laio
Sebastian Goldt
68
8
0
17 Feb 2025
Low-dimensional Functions are Efficiently Learnable under Randomly Biased Distributions
Elisabetta Cornacchia
Dan Mikulincer
Elchanan Mossel
54
0
0
10 Feb 2025
Propagation of Chaos for Mean-Field Langevin Dynamics and its Application to Model Ensemble
Atsushi Nitanda
Anzelle Lee
Damian Tan Xing Kai
Mizuki Sakaguchi
Taiji Suzuki
AI4CE
53
1
0
09 Feb 2025
The Complexity of Learning Sparse Superposed Features with Feedback
The Complexity of Learning Sparse Superposed Features with Feedback
Akash Kumar
92
0
0
08 Feb 2025
Spurious Correlations in High Dimensional Regression: The Roles of Regularization, Simplicity Bias and Over-Parameterization
Spurious Correlations in High Dimensional Regression: The Roles of Regularization, Simplicity Bias and Over-Parameterization
Simone Bombari
Marco Mondelli
113
0
0
03 Feb 2025
Optimal Exact Recovery in Semi-Supervised Learning: A Study of Spectral
  Methods and Graph Convolutional Networks
Optimal Exact Recovery in Semi-Supervised Learning: A Study of Spectral Methods and Graph Convolutional Networks
Hai-Xiao Wang
Zhichao Wang
71
1
0
18 Dec 2024
On the Efficiency of ERM in Feature Learning
On the Efficiency of ERM in Feature Learning
Ayoub El Hanchi
Chris J. Maddison
Murat A. Erdogdu
62
0
0
18 Nov 2024
Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence
Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence
Berfin Simsek
Amire Bendjeddou
Daniel Hsu
36
0
0
13 Nov 2024
Pretrained transformer efficiently learns low-dimensional target
  functions in-context
Pretrained transformer efficiently learns low-dimensional target functions in-context
Kazusato Oko
Yujin Song
Taiji Suzuki
Denny Wu
34
4
0
04 Nov 2024
On the phase diagram of extensive-rank symmetric matrix denoising beyond rotational invariance
On the phase diagram of extensive-rank symmetric matrix denoising beyond rotational invariance
Jean Barbier
Francesco Camilli
Justin Ko
Koki Okajima
21
5
0
04 Nov 2024
Normalization Layer Per-Example Gradients are Sufficient to Predict
  Gradient Noise Scale in Transformers
Normalization Layer Per-Example Gradients are Sufficient to Predict Gradient Noise Scale in Transformers
Gavia Gray
Aman Tiwari
Shane Bergsma
Joel Hestness
25
1
0
01 Nov 2024
A Random Matrix Theory Perspective on the Spectrum of Learned Features
  and Asymptotic Generalization Capabilities
A Random Matrix Theory Perspective on the Spectrum of Learned Features and Asymptotic Generalization Capabilities
Yatin Dandi
Luca Pesce
Hugo Cui
Florent Krzakala
Yue M. Lu
Bruno Loureiro
MLT
35
1
0
24 Oct 2024
Robust Feature Learning for Multi-Index Models in High Dimensions
Robust Feature Learning for Multi-Index Models in High Dimensions
Alireza Mousavi-Hosseini
Adel Javanmard
Murat A. Erdogdu
OOD
AAML
42
1
0
21 Oct 2024
Generalization for Least Squares Regression With Simple Spiked
  Covariances
Generalization for Least Squares Regression With Simple Spiked Covariances
Jiping Li
Rishi Sonthalia
23
0
0
17 Oct 2024
Sharper Guarantees for Learning Neural Network Classifiers with Gradient
  Methods
Sharper Guarantees for Learning Neural Network Classifiers with Gradient Methods
Hossein Taheri
Christos Thrampoulidis
Arya Mazumdar
MLT
31
0
0
13 Oct 2024
Task Diversity Shortens the ICL Plateau
Task Diversity Shortens the ICL Plateau
Jaeyeon Kim
Sehyun Kwon
Joo Young Choi
Jongho Park
Jaewoong Cho
Jason D. Lee
Ernest K. Ryu
MoMe
29
2
0
07 Oct 2024
Random Features Outperform Linear Models: Effect of Strong Input-Label
  Correlation in Spiked Covariance Data
Random Features Outperform Linear Models: Effect of Strong Input-Label Correlation in Spiked Covariance Data
Samet Demir
Zafer Dogan
28
1
0
30 Sep 2024
How Feature Learning Can Improve Neural Scaling Laws
How Feature Learning Can Improve Neural Scaling Laws
Blake Bordelon
Alexander B. Atanasov
C. Pehlevan
49
12
0
26 Sep 2024
From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks
From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks
Clémentine Dominé
Nicolas Anguita
A. Proca
Lukas Braun
D. Kunin
P. Mediano
Andrew M. Saxe
30
3
0
22 Sep 2024
Improving Adaptivity via Over-Parameterization in Sequence Models
Improving Adaptivity via Over-Parameterization in Sequence Models
Yicheng Li
Qian Lin
14
1
0
02 Sep 2024
Learning Multi-Index Models with Neural Networks via Mean-Field Langevin Dynamics
Learning Multi-Index Models with Neural Networks via Mean-Field Langevin Dynamics
Alireza Mousavi-Hosseini
Denny Wu
Murat A. Erdogdu
MLT
AI4CE
27
6
0
14 Aug 2024
On the Generalization of Preference Learning with DPO
On the Generalization of Preference Learning with DPO
Shawn Im
Yixuan Li
44
1
0
06 Aug 2024
Get rich quick: exact solutions reveal how unbalanced initializations
  promote rapid feature learning
Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning
D. Kunin
Allan Raventós
Clémentine Dominé
Feng Chen
David Klindt
Andrew M. Saxe
Surya Ganguli
MLT
35
15
0
10 Jun 2024
Crafting Heavy-Tails in Weight Matrix Spectrum without Gradient Noise
Crafting Heavy-Tails in Weight Matrix Spectrum without Gradient Noise
Vignesh Kothapalli
Tianyu Pang
Shenyang Deng
Zongmin Liu
Yaoqing Yang
29
3
0
07 Jun 2024
Online Learning and Information Exponents: On The Importance of Batch
  size, and Time/Complexity Tradeoffs
Online Learning and Information Exponents: On The Importance of Batch size, and Time/Complexity Tradeoffs
Luca Arnaboldi
Yatin Dandi
Florent Krzakala
Bruno Loureiro
Luca Pesce
Ludovic Stephan
43
1
0
04 Jun 2024
Neural network learns low-dimensional polynomials with SGD near the
  information-theoretic limit
Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit
Jason D. Lee
Kazusato Oko
Taiji Suzuki
Denny Wu
MLT
87
21
0
03 Jun 2024
Understanding and Minimising Outlier Features in Neural Network Training
Understanding and Minimising Outlier Features in Neural Network Training
Bobby He
Lorenzo Noci
Daniele Paliotta
Imanol Schlag
Thomas Hofmann
34
3
0
29 May 2024
Signal-Plus-Noise Decomposition of Nonlinear Spiked Random Matrix Models
Signal-Plus-Noise Decomposition of Nonlinear Spiked Random Matrix Models
Behrad Moniri
Hamed Hassani
29
0
0
28 May 2024
Simplicity Bias of Two-Layer Networks beyond Linearly Separable Data
Simplicity Bias of Two-Layer Networks beyond Linearly Separable Data
Nikita Tsoy
Nikola Konstantinov
32
4
0
27 May 2024
Repetita Iuvant: Data Repetition Allows SGD to Learn High-Dimensional Multi-Index Functions
Repetita Iuvant: Data Repetition Allows SGD to Learn High-Dimensional Multi-Index Functions
Luca Arnaboldi
Yatin Dandi
Florent Krzakala
Luca Pesce
Ludovic Stephan
61
11
0
24 May 2024
Understanding Optimal Feature Transfer via a Fine-Grained Bias-Variance Analysis
Understanding Optimal Feature Transfer via a Fine-Grained Bias-Variance Analysis
Yufan Li
Subhabrata Sen
Ben Adlam
MLT
31
1
0
18 Apr 2024
Sliding down the stairs: how correlated latent variables accelerate
  learning with neural networks
Sliding down the stairs: how correlated latent variables accelerate learning with neural networks
Lorenzo Bardone
Sebastian Goldt
30
7
0
12 Apr 2024
Understanding the Learning Dynamics of Alignment with Human Feedback
Understanding the Learning Dynamics of Alignment with Human Feedback
Shawn Im
Yixuan Li
ALM
24
11
0
27 Mar 2024
Mean-field Analysis on Two-layer Neural Networks from a Kernel
  Perspective
Mean-field Analysis on Two-layer Neural Networks from a Kernel Perspective
Shokichi Takakura
Taiji Suzuki
MLT
17
5
0
22 Mar 2024
Generalization of Scaled Deep ResNets in the Mean-Field Regime
Generalization of Scaled Deep ResNets in the Mean-Field Regime
Yihang Chen
Fanghui Liu
Yiping Lu
Grigorios G. Chrysos
V. Cevher
28
2
0
14 Mar 2024
Towards a theory of model distillation
Towards a theory of model distillation
Enric Boix-Adserà
FedML
VLM
44
6
0
14 Mar 2024
Fundamental limits of Non-Linear Low-Rank Matrix Estimation
Fundamental limits of Non-Linear Low-Rank Matrix Estimation
Pierre Mergny
Justin Ko
Florent Krzakala
Lenka Zdeborová
18
1
0
07 Mar 2024
Asymptotics of Learning with Deep Structured (Random) Features
Asymptotics of Learning with Deep Structured (Random) Features
Dominik Schröder
Daniil Dmitriev
Hugo Cui
Bruno Loureiro
40
6
0
21 Feb 2024
The Evolution of Statistical Induction Heads: In-Context Learning Markov
  Chains
The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains
Benjamin L. Edelman
Ezra Edelman
Surbhi Goel
Eran Malach
Nikolaos Tsilivis
BDL
21
39
0
16 Feb 2024
Feature learning as alignment: a structural property of gradient descent
  in non-linear neural networks
Feature learning as alignment: a structural property of gradient descent in non-linear neural networks
Daniel Beaglehole
Ioannis Mitliagkas
Atish Agarwala
MLT
34
2
0
07 Feb 2024
Asymptotics of feature learning in two-layer networks after one
  gradient-step
Asymptotics of feature learning in two-layer networks after one gradient-step
Hugo Cui
Luca Pesce
Yatin Dandi
Florent Krzakala
Yue M. Lu
Lenka Zdeborová
Bruno Loureiro
MLT
44
16
0
07 Feb 2024
The Benefits of Reusing Batches for Gradient Descent in Two-Layer
  Networks: Breaking the Curse of Information and Leap Exponents
The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents
Yatin Dandi
Emanuele Troiani
Luca Arnaboldi
Luca Pesce
Lenka Zdeborová
Florent Krzakala
MLT
59
25
0
05 Feb 2024
Towards Understanding the Word Sensitivity of Attention Layers: A Study
  via Random Features
Towards Understanding the Word Sensitivity of Attention Layers: A Study via Random Features
Simone Bombari
Marco Mondelli
26
3
0
05 Feb 2024
12
Next