Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks
  Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruosong Wang (24 January 2019) [MLT]
  arXiv: 1901.08584

Papers citing "Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks"

50 / 192 papers shown
Mallows-type model averaging: Non-asymptotic analysis and all-subset combination
  Jingfu Peng (05 May 2025) [MoMe]
A Comprehensive Survey of Synthetic Tabular Data Generation
  Ruxue Shi, Yili Wang, Mengnan Du, Xu Shen, Xin Wang (23 Apr 2025)
Explainable Neural Networks with Guarantees: A Sparse Estimation Approach
  Antoine Ledent, Peng Liu (20 Feb 2025) [FAtt]
MLPs at the EOC: Dynamics of Feature Learning
  Dávid Terjék (18 Feb 2025) [MLT]
SNeRV: Spectra-preserving Neural Representation for Video
  Jina Kim, Jihoo Lee, Je-Won Kang (03 Jan 2025)
Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits
  H. Bui, Enrique Mallada, Anqi Liu (08 Nov 2024)
Towards Understanding Why FixMatch Generalizes Better Than Supervised Learning
  Jingyang Li, Jiachun Pan, Vincent Y. F. Tan, Kim-Chuan Toh, Pan Zhou (15 Oct 2024) [AAML, MLT]
Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data
  Binghui Li, Yuanzhi Li (11 Oct 2024) [OOD]
Extended convexity and smoothness and their applications in deep learning
  Binchuan Qi, Wei Gong, Li Li (08 Oct 2024)
Tuning Frequency Bias of State Space Models
  Annan Yu, Dongwei Lyu, S. H. Lim, Michael W. Mahoney, N. Benjamin Erichson (02 Oct 2024)
Optimal Kernel Quantile Learning with Random Features
  Caixing Wang, Xingdong Feng (24 Aug 2024)
Many Perception Tasks are Highly Redundant Functions of their Input Data
  Rahul Ramesh, Anthony Bisulco, Ronald W. DiTullio, Linran Wei, Vijay Balasubramanian, Kostas Daniilidis, Pratik Chaudhari (18 Jul 2024)
Loss Gradient Gaussian Width based Generalization and Optimization Guarantees
  A. Banerjee, Qiaobo Li, Yingxue Zhou (11 Jun 2024)
On the Rashomon ratio of infinite hypothesis sets
  Evzenie Coupkova, Mireille Boutin (27 Apr 2024)
Implicit Bias of AdamW: $\ell_\infty$ Norm Constrained Optimization
  Shuo Xie, Zhiyuan Li (05 Apr 2024) [OffRL]
NTK-Guided Few-Shot Class Incremental Learning
  Jingren Liu, Zhong Ji, Yanwei Pang, Yunlong Yu (19 Mar 2024) [CLL]
How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance
  Hongkang Li, Shuai Zhang, Yihua Zhang, Meng Wang, Sijia Liu, Pin-Yu Chen (12 Mar 2024)
Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding
  Zhengqing Wu, Berfin Simsek, Francois Ged (08 Feb 2024) [ODL]
Non-convergence to global minimizers for Adam and stochastic gradient descent optimization and constructions of local minimizers in the training of artificial neural networks
  Arnulf Jentzen, Adrian Riekert (07 Feb 2024)
Characterizing Overfitting in Kernel Ridgeless Regression Through the Eigenspectrum
  Tin Sum Cheng, Aurélien Lucchi, Anastasis Kratsios, David Belius (02 Feb 2024)
Lifted RDT based capacity analysis of the 1-hidden layer treelike sign perceptrons neural networks
  M. Stojnic (13 Dec 2023)
Capacity of the treelike sign perceptrons neural networks with one hidden layer -- RDT based upper bounds
  M. Stojnic (13 Dec 2023)
Gradual Domain Adaptation: Theory and Algorithms
  Yifei He, Haoxiang Wang, Bo Li, Han Zhao (20 Oct 2023) [CLL]
How to Protect Copyright Data in Optimization of Large Language Models?
  T. Chu, Zhao-quan Song, Chiwun Yang (23 Aug 2023)
Understanding Deep Neural Networks via Linear Separability of Hidden Layers
  Chao Zhang, Xinyuan Chen, Wensheng Li, Lixue Liu, Wei Wu, Dacheng Tao (26 Jul 2023)
Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification
  Lianke Qin, Zhao-quan Song, Yuanyuan Yang (13 Jul 2023)
Training-Free Neural Active Learning with Initialization-Robustness Guarantees
  Apivich Hemachandra, Zhongxiang Dai, Jasraj Singh, See-Kiong Ng, K. H. Low (07 Jun 2023) [AAML]
Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks
  Puyu Wang, Yunwen Lei, Di Wang, Yiming Ying, Ding-Xuan Zhou (26 May 2023) [MLT]
Fast Convergence in Learning Two-Layer Neural Networks with Separable Data
  Hossein Taheri, Christos Thrampoulidis (22 May 2023) [MLT]
Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks
  Eshaan Nichani, Alexandru Damian, Jason D. Lee (11 May 2023) [MLT]
On the Eigenvalue Decay Rates of a Class of Neural-Network Related Kernel Functions Defined on General Domains
  Yicheng Li, Zixiong Yu, Y. Cotronis, Qian Lin (04 May 2023)
Wide neural networks: From non-gaussian random fields at initialization to the NTK geometry of training
  Luís Carvalho, João L. Costa, José Mourão, Gonçalo Oliveira (06 Apr 2023) [AI4CE]
On the Stepwise Nature of Self-Supervised Learning
  James B. Simon, Maksis Knutins, Liu Ziyin, Daniel Geisz, Abraham J. Fetterman, Joshua Albrecht (27 Mar 2023) [SSL]
Online Learning for the Random Feature Model in the Student-Teacher Framework
  Roman Worschech, B. Rosenow (24 Mar 2023)
Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron
  Weihang Xu, S. Du (20 Feb 2023)
Reinforcement Learning with Function Approximation: From Linear to Nonlinear
  Jihao Long, Jiequn Han (20 Feb 2023)
A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity
  Hongkang Li, M. Wang, Sijia Liu, Pin-Yu Chen (12 Feb 2023) [ViT, MLT]
Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels
  Simone Bombari, Shayan Kiyani, Marco Mondelli (03 Feb 2023) [AAML]
Over-parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning
  François Caron, Fadhel Ayed, Paul Jung, Hoileong Lee, Juho Lee, Hongseok Yang (02 Feb 2023)
Supervision Complexity and its Role in Knowledge Distillation
  Hrayr Harutyunyan, A. S. Rawat, A. Menon, Seungyeon Kim, Surinder Kumar (28 Jan 2023)
ZiCo: Zero-shot NAS via Inverse Coefficient of Variation on Gradients
  Guihong Li, Yuedong Yang, Kartikeya Bhardwaj, R. Marculescu (26 Jan 2023)
Convergence beyond the over-parameterized regime using Rayleigh quotients
  David A. R. Robin, Kevin Scaman, Marc Lelarge (19 Jan 2023)
An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models
  Yufeng Zhang, Boyi Liu, Qi Cai, Lingxiao Wang, Zhaoran Wang (30 Dec 2022)
Learning Lipschitz Functions by GD-trained Shallow Overparameterized ReLU Neural Networks
  Ilja Kuzborskij, Csaba Szepesvári (28 Dec 2022)
Learning from Training Dynamics: Identifying Mislabeled Data Beyond Manually Designed Features
  Qingrui Jia, Xuhong Li, Lei Yu, Jiang Bian, Penghao Zhao, Shupeng Li, Haoyi Xiong, Dejing Dou (19 Dec 2022) [NoLa]
Learning threshold neurons via the "edge of stability"
  Kwangjun Ahn, Sébastien Bubeck, Sinho Chewi, Y. Lee, Felipe Suarez, Yi Zhang (14 Dec 2022) [MLT]
Leveraging Unlabeled Data to Track Memorization
  Mahsa Forouzesh, Hanie Sedghi, Patrick Thiran (08 Dec 2022) [NoLa, TDI]
Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing
  Josh Alman, Jiehao Liang, Zhao-quan Song, Ruizhe Zhang, Danyang Zhuo (25 Nov 2022)
Two Facets of SDE Under an Information-Theoretic Lens: Generalization of SGD via Training Trajectories and via Terminal States
  Ziqiao Wang, Yongyi Mao (19 Nov 2022)
Characterizing the Spectrum of the NTK via a Power Series Expansion
  Michael Murray, Hui Jin, Benjamin Bowman, Guido Montúfar (15 Nov 2022)