A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks

11 October 2023
Behrad Moniri, Donghwan Lee, Hamed Hassani, Edgar Dobriban
MLT

Papers citing "A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks"

15 papers shown

Asymptotic Analysis of Two-Layer Neural Networks after One Gradient Step under Gaussian Mixtures Data with Structure
Samet Demir, Zafer Dogan
MLT
02 Mar 2025

Spurious Correlations in High Dimensional Regression: The Roles of Regularization, Simplicity Bias and Over-Parameterization
Simone Bombari, Marco Mondelli
03 Feb 2025

A Random Matrix Theory Perspective on the Spectrum of Learned Features and Asymptotic Generalization Capabilities
Yatin Dandi, Luca Pesce, Hugo Cui, Florent Krzakala, Yue M. Lu, Bruno Loureiro
MLT
24 Oct 2024

Robust Feature Learning for Multi-Index Models in High Dimensions
Alireza Mousavi-Hosseini, Adel Javanmard, Murat A. Erdogdu
OOD, AAML
21 Oct 2024

Generalization for Least Squares Regression With Simple Spiked Covariances
Jiping Li, Rishi Sonthalia
17 Oct 2024

Random Features Outperform Linear Models: Effect of Strong Input-Label Correlation in Spiked Covariance Data
Samet Demir, Zafer Dogan
30 Sep 2024

Crafting Heavy-Tails in Weight Matrix Spectrum without Gradient Noise
Vignesh Kothapalli, Tianyu Pang, Shenyang Deng, Zongmin Liu, Yaoqing Yang
07 Jun 2024

Online Learning and Information Exponents: On The Importance of Batch Size, and Time/Complexity Tradeoffs
Luca Arnaboldi, Yatin Dandi, Florent Krzakala, Bruno Loureiro, Luca Pesce, Ludovic Stephan
04 Jun 2024

Signal-Plus-Noise Decomposition of Nonlinear Spiked Random Matrix Models
Behrad Moniri, Hamed Hassani
28 May 2024

Repetita Iuvant: Data Repetition Allows SGD to Learn High-Dimensional Multi-Index Functions
Luca Arnaboldi, Yatin Dandi, Florent Krzakala, Luca Pesce, Ludovic Stephan
24 May 2024

Asymptotics of Learning with Deep Structured (Random) Features
Dominik Schröder, Daniil Dmitriev, Hugo Cui, Bruno Loureiro
21 Feb 2024

Feature Learning as Alignment: A Structural Property of Gradient Descent in Non-Linear Neural Networks
Daniel Beaglehole, Ioannis Mitliagkas, Atish Agarwala
MLT
07 Feb 2024

Asymptotics of Feature Learning in Two-Layer Networks after One Gradient-Step
Hugo Cui, Luca Pesce, Yatin Dandi, Florent Krzakala, Yue M. Lu, Lenka Zdeborová, Bruno Loureiro
MLT
07 Feb 2024

The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents
Yatin Dandi, Emanuele Troiani, Luca Arnaboldi, Luca Pesce, Lenka Zdeborová, Florent Krzakala
MLT
05 Feb 2024

Towards Understanding the Word Sensitivity of Attention Layers: A Study via Random Features
Simone Bombari, Marco Mondelli
05 Feb 2024