Learning a Single Neuron with Gradient Methods

Annual Conference Computational Learning Theory (COLT), 2020
15 January 2020
Gilad Yehudai
Ohad Shamir
    MLT
arXiv:2001.05205

Papers citing "Learning a Single Neuron with Gradient Methods"

48 / 48 papers shown
Gradient descent for deep equilibrium single-index models
Sanjit Dandapanthula
Aaditya Ramdas
230
0
0
21 Nov 2025
Block Coordinate Descent for Neural Networks Provably Finds Global Minima
Shunta Akiyama
170
2
0
26 Oct 2025
A Derandomization Framework for Structure Discovery: Applications in Neural Networks and Beyond
Nikos Tsikouras
Yorgos Pantis
Ioannis Mitliagkas
Christos Tzamos
BDL
210
0
0
22 Oct 2025
Low-dimensional Functions are Efficiently Learnable under Randomly Biased Distributions
Annual Conference Computational Learning Theory (COLT), 2025
Elisabetta Cornacchia
Dan Mikulincer
Elchanan Mossel
447
6
0
10 Feb 2025
Learning a Single Neuron Robustly to Distributional Shifts and Adversarial Label Noise
Neural Information Processing Systems (NeurIPS), 2024
Shuyao Li
Sushrut Karmalkar
Ilias Diakonikolas
Jelena Diakonikolas
OOD
274
4
0
11 Nov 2024
Online Non-Stationary Stochastic Quasar-Convex Optimization
Yuen-Man Pun
Iman Shames
220
2
0
04 Jul 2024
Disentangle Sample Size and Initialization Effect on Perfect Generalization for Single-Neuron Target
Jiajie Zhao
Zhiwei Bai
Yaoyu Zhang
342
1
0
22 May 2024
Bayesian Inference for Consistent Predictions in Overparameterized Nonlinear Regression
Tomoya Wakayama
BDL
365
0
0
06 Apr 2024
Masks, Signs, And Learning Rate Rewinding
Advait Gadhikar
R. Burkholz
281
15
0
29 Feb 2024
RedEx: Beyond Fixed Representation Methods via Convex Optimization
International Conference on Algorithmic Learning Theory (ALT), 2024
Amit Daniely
Mariano Schain
Gilad Yehudai
245
1
0
15 Jan 2024
The Local Landscape of Phase Retrieval Under Limited Samples
IEEE Transactions on Information Theory (IEEE Trans. Inf. Theory), 2023
Kaizhao Liu
Zihao Wang
Lei Wu
297
3
0
26 Nov 2023
Should Under-parameterized Student Networks Copy or Average Teacher Weights?
Neural Information Processing Systems (NeurIPS), 2023
Berfin Simsek
Amire Bendjeddou
W. Gerstner
Johanni Brea
362
10
0
03 Nov 2023
Symmetric Single Index Learning
International Conference on Learning Representations (ICLR), 2023
Aaron Zweig
Joan Bruna
MLT
272
4
0
03 Oct 2023
Distribution-Independent Regression for Generalized Linear Models with Oblivious Corruptions
Annual Conference Computational Learning Theory (COLT), 2023
Ilias Diakonikolas
Sushrut Karmalkar
Jongho Park
Christos Tzamos
397
2
0
20 Sep 2023
Gradient-Based Feature Learning under Structured Data
Neural Information Processing Systems (NeurIPS), 2023
Alireza Mousavi-Hosseini
Denny Wu
Taiji Suzuki
Murat A. Erdogdu
MLT
356
29
0
07 Sep 2023
Max-affine regression via first-order methods
SIAM Journal on Mathematics of Data Science (SIMODS), 2023
Seonho Kim
Kiryung Lee
222
3
0
15 Aug 2023
The Effect of SGD Batch Size on Autoencoder Learning: Sparsity, Sharpness, and Feature Learning
Nikhil Ghosh
Spencer Frei
Wooseok Ha
Ting Yu
MLT
329
5
0
06 Aug 2023
On Single Index Models beyond Gaussian Data
Neural Information Processing Systems (NeurIPS), 2023
Joan Bruna
Loucas Pillaud-Vivien
Aaron Zweig
302
15
0
28 Jul 2023
Robustly Learning a Single Neuron via Sharpness
International Conference on Machine Learning (ICML), 2023
Puqian Wang
Nikos Zarifis
Ilias Diakonikolas
Jelena Diakonikolas
207
14
0
13 Jun 2023
Learning a Neuron by a Shallow ReLU Network: Dynamics and Implicit Bias for Correlated Inputs
Neural Information Processing Systems (NeurIPS), 2023
D. Chistikov
Matthias Englert
R. Lazic
MLT
322
16
0
10 Jun 2023
Expand-and-Cluster: Parameter Recovery of Neural Networks
International Conference on Machine Learning (ICML), 2023
Flavio Martinelli
Berfin Simsek
W. Gerstner
Johanni Brea
616
15
0
25 Apr 2023
Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron
International Conference on Machine Learning (ICML), 2023
Jingfeng Wu
Difan Zou
Zixiang Chen
Vladimir Braverman
Quanquan Gu
Sham Kakade
321
9
0
03 Mar 2023
Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron
Annual Conference Computational Learning Theory (COLT), 2023
Weihang Xu
S. Du
429
22
0
20 Feb 2023
Continuized Acceleration for Quasar Convex Functions in Non-Convex Optimization
International Conference on Learning Representations (ICLR), 2023
Jun-Kun Wang
Andre Wibisono
252
20
0
15 Feb 2023
Active Learning for Single Neuron Models with Lipschitz Non-Linearities
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Aarshvi Gajjar
Chinmay Hegde
Christopher Musco
478
13
0
24 Oct 2022
SQ Lower Bounds for Learning Single Neurons with Massart Noise
Neural Information Processing Systems (NeurIPS), 2022
Ilias Diakonikolas
D. Kane
Lisheng Ren
Yuxin Sun
161
8
0
18 Oct 2022
Magnitude and Angle Dynamics in Training Single ReLU Neurons
Neural Networks (NN), 2022
Sangmin Lee
Byeongsu Sim
Jong Chul Ye
MLT
411
6
0
27 Sep 2022
Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit
Neural Information Processing Systems (NeurIPS), 2022
Boaz Barak
Benjamin L. Edelman
Surbhi Goel
Sham Kakade
Eran Malach
Cyril Zhang
470
173
0
18 Jul 2022
Learning a Single Neuron with Adversarial Label Noise via Gradient Descent
Annual Conference Computational Learning Theory (COLT), 2022
Ilias Diakonikolas
Vasilis Kontonis
Christos Tzamos
Nikos Zarifis
MLT
219
24
0
17 Jun 2022
Excess Risk of Two-Layer ReLU Neural Networks in Teacher-Student Settings and its Superiority to Kernel Methods
International Conference on Learning Representations (ICLR), 2022
Shunta Akiyama
Taiji Suzuki
337
9
0
30 May 2022
Learning a Single Neuron for Non-monotonic Activation Functions
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Lei Wu
MLT
218
17
0
16 Feb 2022
Random Feature Amplification: Feature Learning and Generalization in Neural Networks
Journal of Machine Learning Research (JMLR), 2022
Spencer Frei
Niladri S. Chatterji
Peter L. Bartlett
MLT
345
36
0
15 Feb 2022
Optimization-Based Separations for Neural Networks
Annual Conference Computational Learning Theory (COLT), 2021
Itay Safran
Jason D. Lee
779
19
0
04 Dec 2021
ReLU Regression with Massart Noise
Neural Information Processing Systems (NeurIPS), 2021
Ilias Diakonikolas
Jongho Park
Christos Tzamos
288
13
0
10 Sep 2021
Proxy Convexity: A Unified Framework for the Analysis of Neural Networks Trained by Gradient Descent
Neural Information Processing Systems (NeurIPS), 2021
Spencer Frei
Quanquan Gu
396
29
0
25 Jun 2021
On Learnability via Gradient Method for Two-Layer ReLU Neural Networks in Teacher-Student Setting
International Conference on Machine Learning (ICML), 2021
Shunta Akiyama
Taiji Suzuki
MLT
337
16
0
11 Jun 2021
Early-stopped neural networks are consistent
Neural Information Processing Systems (NeurIPS), 2021
Ziwei Ji
Justin D. Li
Matus Telgarsky
261
50
0
10 Jun 2021
Learning a Single Neuron with Bias Using Gradient Descent
Neural Information Processing Systems (NeurIPS), 2021
Gal Vardi
Gilad Yehudai
Ohad Shamir
MLT
352
22
0
02 Jun 2021
Directional Convergence Analysis under Spherically Symmetric Distribution
Dachao Lin
Zhihua Zhang
MLT
161
0
0
09 May 2021
Neurons learn slower than they think
I. Kulikovskikh
192
0
0
02 Apr 2021
Painless step size adaptation for SGD
I. Kulikovskikh
Tarzan Legović
207
0
0
01 Feb 2021
Implicit Regularization in ReLU Networks with the Square Loss
Annual Conference Computational Learning Theory (COLT), 2020
Gal Vardi
Ohad Shamir
292
53
0
09 Dec 2020
How Does the Task Landscape Affect MAML Performance?
Liam Collins
Aryan Mokhtari
Sanjay Shakkottai
409
5
0
27 Oct 2020
Understanding How Over-Parametrization Leads to Acceleration: A case of learning a single teacher neuron
Asian Conference on Machine Learning (ACML), 2020
Jun-Kun Wang
Jacob D. Abernethy
347
1
0
04 Oct 2020
A Modular Analysis of Provable Acceleration via Polyak's Momentum: Training a Wide ReLU Network and a Deep Linear Network
International Conference on Machine Learning (ICML), 2020
Jun-Kun Wang
Chi-Heng Lin
Jacob D. Abernethy
726
26
0
04 Oct 2020
Statistical-Query Lower Bounds via Functional Gradients
Surbhi Goel
Aravind Gollakota
Adam R. Klivans
305
68
0
29 Jun 2020
The Effects of Mild Over-parameterization on the Optimization Landscape of Shallow ReLU Neural Networks
Annual Conference Computational Learning Theory (COLT), 2020
Itay Safran
Gilad Yehudai
Ohad Shamir
419
41
0
01 Jun 2020
Agnostic Learning of a Single Neuron with Gradient Descent
Neural Information Processing Systems (NeurIPS), 2020
Spencer Frei
Yuan Cao
Quanquan Gu
MLT
460
66
0
29 May 2020