Characterizing Implicit Bias in Terms of Optimization Geometry

22 February 2018
Suriya Gunasekar
Jason D. Lee
Daniel Soudry
Nathan Srebro
    AI4CE

Papers citing "Characterizing Implicit Bias in Terms of Optimization Geometry"

50 / 72 papers shown
Title
Entropic Mirror Descent for Linear Systems: Polyak's Stepsize and Implicit Bias
Yura Malitsky
Alexander Posch
19
0
0
05 May 2025
Gradient Descent Robustly Learns the Intrinsic Dimension of Data in Training Convolutional Neural Networks
Chenyang Zhang
Peifeng Gao
Difan Zou
Yuan Cao
OOD
MLT
59
0
0
11 Apr 2025
Theory on Mixture-of-Experts in Continual Learning
Hongbo Li
Sen-Fon Lin
Lingjie Duan
Yingbin Liang
Ness B. Shroff
MoE
MoMe
CLL
151
14
0
20 Feb 2025
Implicit Geometry of Next-token Prediction: From Language Sparsity Patterns to Model Representations
Yize Zhao
Tina Behnia
V. Vakilian
Christos Thrampoulidis
55
8
0
20 Feb 2025
The late-stage training dynamics of (stochastic) subgradient descent on homogeneous neural networks
Sholom Schechtman
Nicolas Schreuder
120
0
0
08 Feb 2025
Optimization Insights into Deep Diagonal Linear Networks
Hippolyte Labarrière
C. Molinari
Lorenzo Rosasco
S. Villa
Cristian Vega
76
0
0
21 Dec 2024
Theoretical Insights into Overparameterized Models in Multi-Task and Replay-Based Continual Learning
Mohammadamin Banayeeanzade
Mahdi Soltanolkotabi
Mohammad Rostami
CLL
LRM
83
1
0
29 Aug 2024
Mask in the Mirror: Implicit Sparsification
Tom Jacobs
R. Burkholz
40
3
0
19 Aug 2024
How DNNs break the Curse of Dimensionality: Compositionality and Symmetry Learning
Arthur Jacot
Seok Hoan Choi
Yuxiao Wen
AI4CE
86
2
0
08 Jul 2024
Hamiltonian Mechanics of Feature Learning: Bottleneck Structure in Leaky ResNets
Arthur Jacot
Alexandre Kaiser
36
0
0
27 May 2024
When does compositional structure yield compositional generalization? A kernel theory
Samuel Lippl
Kim Stachenfeld
NAI
CoGe
65
5
0
26 May 2024
Hidden Synergy: $L_1$ Weight Normalization and 1-Path-Norm Regularization
Aditya Biswas
36
0
0
29 Apr 2024
High-dimensional analysis of ridge regression for non-identically distributed data with a variance profile
Jérémie Bigot
Issa-Mbenard Dabo
Camille Male
29
4
0
29 Mar 2024
Which Frequencies do CNNs Need? Emergent Bottleneck Structure in Feature Learning
Yuxiao Wen
Arthur Jacot
47
6
0
12 Feb 2024
Critical Influence of Overparameterization on Sharpness-aware Minimization
Sungbin Shin
Dongyeop Lee
Maksym Andriushchenko
Namhoon Lee
AAML
39
1
0
29 Nov 2023
Precise Asymptotic Generalization for Multiclass Classification with Overparameterized Linear Models
David X. Wu
A. Sahai
21
2
0
23 Jun 2023
The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks
Yuan Cao
Difan Zou
Yuanzhi Li
Quanquan Gu
MLT
29
5
0
20 Jun 2023
Unraveling Projection Heads in Contrastive Learning: Insights from Expansion and Shrinkage
Yu Gui
Cong Ma
Yiqiao Zhong
17
6
0
06 Jun 2023
Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability
Jingfeng Wu
Vladimir Braverman
Jason D. Lee
24
17
0
19 May 2023
Robust Implicit Regularization via Weight Normalization
H. Chou
Holger Rauhut
Rachel A. Ward
28
7
0
09 May 2023
General Loss Functions Lead to (Approximate) Interpolation in High Dimensions
Kuo-Wei Lai
Vidya Muthukumar
16
5
0
13 Mar 2023
mSAM: Micro-Batch-Averaged Sharpness-Aware Minimization
Kayhan Behdin
Qingquan Song
Aman Gupta
S. Keerthi
Ayan Acharya
Borja Ocejo
Gregory Dexter
Rajiv Khanna
D. Durfee
Rahul Mazumder
AAML
13
7
0
19 Feb 2023
Implicit Regularization Leads to Benign Overfitting for Sparse Linear Regression
Mo Zhou
Rong Ge
27
2
0
01 Feb 2023
Generalization on the Unseen, Logic Reasoning and Degree Curriculum
Emmanuel Abbe
Samy Bengio
Aryo Lotfi
Kevin Rizk
LRM
28
47
0
30 Jan 2023
Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing
Jikai Jin
Zhiyuan Li
Kaifeng Lyu
S. Du
Jason D. Lee
MLT
40
34
0
27 Jan 2023
Tight bounds for maximum $\ell_1$-margin classifiers
Stefan Stojanovic
Konstantin Donhauser
Fanny Yang
29
0
0
07 Dec 2022
Regression as Classification: Influence of Task Formulation on Neural Network Features
Lawrence Stewart
Francis R. Bach
Quentin Berthet
Jean-Philippe Vert
27
24
0
10 Nov 2022
Stochastic Mirror Descent in Average Ensemble Models
Taylan Kargin
Fariborz Salehi
B. Hassibi
11
1
0
27 Oct 2022
From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent
Satyen Kale
Jason D. Lee
Chris De Sa
Ayush Sekhari
Karthik Sridharan
19
4
0
13 Oct 2022
Annihilation of Spurious Minima in Two-Layer ReLU Networks
Yossi Arjevani
M. Field
16
8
0
12 Oct 2022
Deep Linear Networks can Benignly Overfit when Shallow Ones Do
Niladri S. Chatterji
Philip M. Long
13
8
0
19 Sep 2022
On the Implicit Bias in Deep-Learning Algorithms
Gal Vardi
FedML
AI4CE
30
72
0
26 Aug 2022
Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent
Zhiyuan Li
Tianhao Wang
Jason D. Lee
Sanjeev Arora
32
27
0
08 Jul 2022
Label noise (stochastic) gradient descent implicitly solves the Lasso for quadratic parametrisation
Loucas Pillaud-Vivien
J. Reygner
Nicolas Flammarion
NoLa
31
31
0
20 Jun 2022
Reconstructing Training Data from Trained Neural Networks
Niv Haim
Gal Vardi
Gilad Yehudai
Ohad Shamir
Michal Irani
27
132
0
15 Jun 2022
Thinking Outside the Ball: Optimal Learning with Gradient Descent for Generalized Linear Stochastic Convex Optimization
I Zaghloul Amir
Roi Livni
Nathan Srebro
22
6
0
27 Feb 2022
Benign Overfitting in Adversarially Robust Linear Classification
Jinghui Chen
Yuan Cao
Quanquan Gu
AAML
SILM
26
10
0
31 Dec 2021
The Convex Geometry of Backpropagation: Neural Network Gradient Flows Converge to Extreme Points of the Dual Convex Program
Yifei Wang
Mert Pilanci
MLT
MDE
47
11
0
13 Oct 2021
Implicit Bias of Linear Equivariant Networks
Hannah Lawrence
Kristian Georgiev
A. Dienes
B. Kiani
AI4CE
32
14
0
12 Oct 2021
On Margin Maximization in Linear and ReLU Networks
Gal Vardi
Ohad Shamir
Nathan Srebro
45
28
0
06 Oct 2021
Spectral Bias in Practice: The Role of Function Frequency in Generalization
Sara Fridovich-Keil
Raphael Gontijo-Lopes
Rebecca Roelofs
20
28
0
06 Oct 2021
A Theoretical Analysis of Fine-tuning with Linear Teachers
Gal Shachaf
Alon Brutzkus
Amir Globerson
26
17
0
04 Jul 2021
What can linearized neural networks actually say about generalization?
Guillermo Ortiz-Jiménez
Seyed-Mohsen Moosavi-Dezfooli
P. Frossard
21
43
0
12 Jun 2021
On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent
Shahar Azulay
E. Moroshko
Mor Shpigel Nacson
Blake E. Woodworth
Nathan Srebro
Amir Globerson
Daniel Soudry
AI4CE
25
73
0
19 Feb 2021
Obtaining Adjustable Regularization for Free via Iterate Averaging
Jingfeng Wu
Vladimir Braverman
Lin F. Yang
19
2
0
15 Aug 2020
Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy
E. Moroshko
Suriya Gunasekar
Blake E. Woodworth
J. Lee
Nathan Srebro
Daniel Soudry
16
85
0
13 Jul 2020
When Does Preconditioning Help or Hurt Generalization?
S. Amari
Jimmy Ba
Roger C. Grosse
Xuechen Li
Atsushi Nitanda
Taiji Suzuki
Denny Wu
Ji Xu
26
32
0
18 Jun 2020
Neural Anisotropy Directions
Guillermo Ortiz-Jiménez
Apostolos Modas
Seyed-Mohsen Moosavi-Dezfooli
P. Frossard
26
16
0
17 Jun 2020
Shape Matters: Understanding the Implicit Bias of the Noise Covariance
Jeff Z. HaoChen
Colin Wei
J. Lee
Tengyu Ma
18
93
0
15 Jun 2020
To Each Optimizer a Norm, To Each Norm its Generalization
Sharan Vaswani
Reza Babanezhad
Jose Gallego
Aaron Mishkin
Simon Lacoste-Julien
Nicolas Le Roux
13
8
0
11 Jun 2020