arXiv:1905.13210
Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks
Yuan Cao, Quanquan Gu. 30 May 2019. [MLT, AI4CE]
Papers citing "Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks" (50 of 112 shown)
- Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits. H. Bui, Enrique Mallada, Anqi Liu. 08 Nov 2024.
- Sharper Guarantees for Learning Neural Network Classifiers with Gradient Methods. Hossein Taheri, Christos Thrampoulidis, Arya Mazumdar. [MLT] 13 Oct 2024.
- A Cost-Aware Approach to Adversarial Robustness in Neural Networks. Charles Meyers, Mohammad Reza Saleh Sedghpour, Tommy Löfstedt, Erik Elmroth. [OOD, AAML] 11 Sep 2024.
- An Improved Finite-time Analysis of Temporal Difference Learning with Deep Neural Networks. Zhifa Ke, Zaiwen Wen, Junyu Zhang. 07 May 2024.
- How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance. Hongkang Li, Shuai Zhang, Yihua Zhang, Meng Wang, Sijia Liu, Pin-Yu Chen. 12 Mar 2024.
- Implicit Bias and Fast Convergence Rates for Self-attention. Bhavya Vasudeva, Puneesh Deora, Christos Thrampoulidis. 08 Feb 2024.
- Regularized Q-Learning with Linear Function Approximation. Jiachen Xi, Alfredo Garcia, P. Momcilovic. 26 Jan 2024.
- Weak Correlations as the Underlying Principle for Linearization of Gradient-Based Learning Systems. Ori Shem-Ur, Yaron Oz. 08 Jan 2024.
- Gradual Domain Adaptation: Theory and Algorithms. Yifei He, Haoxiang Wang, Bo Li, Han Zhao. [CLL] 20 Oct 2023.
- Differentially Private Non-convex Learning for Multi-layer Neural Networks. Hanpu Shen, Cheng-Long Wang, Zihang Xiang, Yiming Ying, Di Wang. 12 Oct 2023.
- ROMO: Retrieval-enhanced Offline Model-based Optimization. Mingcheng Chen, Haoran Zhao, Yuxiang Zhao, Hulei Fan, Hongqiao Gao, Yong Yu, Zheng Tian. [OffRL] 11 Oct 2023.
- How to Protect Copyright Data in Optimization of Large Language Models? T. Chu, Zhao Song, Chiwun Yang. 23 Aug 2023.
- Modify Training Directions in Function Space to Reduce Generalization Error. Yi Yu, Wenlian Lu, Boyu Chen. 25 Jul 2023.
- Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification. Lianke Qin, Zhao Song, Yuanyuan Yang. 13 Jul 2023.
- Training-Free Neural Active Learning with Initialization-Robustness Guarantees. Apivich Hemachandra, Zhongxiang Dai, Jasraj Singh, See-Kiong Ng, K. H. Low. [AAML] 07 Jun 2023.
- Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks. Puyu Wang, Yunwen Lei, Di Wang, Yiming Ying, Ding-Xuan Zhou. [MLT] 26 May 2023.
- Mind the spikes: Benign overfitting of kernels and neural networks in fixed dimension. Moritz Haas, David Holzmüller, U. V. Luxburg, Ingo Steinwart. [MLT] 23 May 2023.
- Fast Convergence in Learning Two-Layer Neural Networks with Separable Data. Hossein Taheri, Christos Thrampoulidis. [MLT] 22 May 2023.
- Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks. Eshaan Nichani, Alexandru Damian, Jason D. Lee. [MLT] 11 May 2023.
- Neural Exploitation and Exploration of Contextual Bandits. Yikun Ban, Yuchen Yan, A. Banerjee, Jingrui He. 05 May 2023.
- Wide neural networks: From non-gaussian random fields at initialization to the NTK geometry of training. Luís Carvalho, João L. Costa, José Mourão, Gonçalo Oliveira. [AI4CE] 06 Apr 2023.
- Generalization analysis of an unfolding network for analysis-based Compressed Sensing. Vicky Kouni, Yannis Panagakis. [MLT] 09 Mar 2023.
- VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation. Thanh Nguyen-Tang, R. Arora. [OffRL] 24 Feb 2023.
- Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron. Weihang Xu, S. Du. 20 Feb 2023.
- A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity. Hongkang Li, Hao Wu, Sijia Liu, Pin-Yu Chen. [ViT, MLT] 12 Feb 2023.
- Algorithmic Stability of Heavy-Tailed SGD with General Loss Functions. Anant Raj, Lingjiong Zhu, Mert Gurbuzbalaban, Umut Simsekli. 27 Jan 2023.
- An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models. Yufeng Zhang, Boyi Liu, Qi Cai, Lingxiao Wang, Zhaoran Wang. 30 Dec 2022.
- Graph Neural Networks are Inherently Good Generalizers: Insights by Bridging GNNs and MLPs. Chenxiao Yang, Qitian Wu, Jiahua Wang, Junchi Yan. [AI4CE] 18 Dec 2022.
- Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing. Josh Alman, Jiehao Liang, Zhao Song, Ruizhe Zhang, Danyang Zhuo. 25 Nov 2022.
- Cold Start Streaming Learning for Deep Networks. Cameron R. Wolfe, Anastasios Kyrillidis. [CLL] 09 Nov 2022.
- A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks. Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna. [MLT] 28 Oct 2022.
- Learning-based Design of Luenberger Observers for Autonomous Nonlinear Systems. Muhammad Umar B. Niazi, Johnson R. Cao, Xu-yang Sun, Amritam Das, Karl H. Johansson. [OOD] 04 Oct 2022.
- On the optimization and generalization of overparameterized implicit neural networks. Tianxiang Gao, Hongyang Gao. [MLT, AI4CE] 30 Sep 2022.
- Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization). Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, V. Cevher. 15 Sep 2022.
- Generalization Properties of NAS under Activation and Skip Connection Search. Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, V. Cevher. [AI4CE] 15 Sep 2022.
- Towards Understanding Mixture of Experts in Deep Learning. Zixiang Chen, Yihe Deng, Yue-bo Wu, Quanquan Gu, Yuan-Fang Li. [MLT, MoE] 04 Aug 2022.
- Graph Neural Network Bandits. Parnian Kassraie, Andreas Krause, Ilija Bogunovic. 13 Jul 2022.
- Informed Learning by Wide Neural Networks: Convergence, Generalization and Sampling Complexity. Jianyi Yang, Shaolei Ren. 02 Jul 2022.
- Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis. Alexander Munteanu, Simon Omlor, Zhao Song, David P. Woodruff. 26 Jun 2022.
- Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction. Kaifeng Lyu, Zhiyuan Li, Sanjeev Arora. [FAtt] 14 Jun 2022.
- Meet You Halfway: Explaining Deep Learning Mysteries. Oriel BenShmuel. [AAML, FedML, FAtt, OOD] 09 Jun 2022.
- Wavelet Regularization Benefits Adversarial Training. Jun Yan, Huilin Yin, Xiaoyang Deng, Zi-qin Zhao, Wancheng Ge, Hao Zhang, Gerhard Rigoll. [AAML] 08 Jun 2022.
- Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials. Eshaan Nichani, Yunzhi Bai, Jason D. Lee. 08 Jun 2022.
- Sobolev Acceleration and Statistical Optimality for Learning Elliptic Equations via Gradient Descent. Yiping Lu, Jose H. Blanchet, Lexing Ying. 15 May 2022.
- Convergence of gradient descent for deep neural networks. S. Chatterjee. [ODL] 30 Mar 2022.
- Global Convergence of MAML and Theory-Inspired Neural Architecture Search for Few-Shot Learning. Haoxiang Wang, Yite Wang, Ruoyu Sun, Bo-wen Li. 17 Mar 2022.
- Explicitising The Implicit Intrepretability of Deep Neural Networks Via Duality. Chandrashekar Lakshminarayanan, Ashutosh Kumar Singh, A. Rajkumar. [AI4CE] 01 Mar 2022.
- Benefit of Interpolation in Nearest Neighbor Algorithms. Yue Xing, Qifan Song, Guang Cheng. 23 Feb 2022.
- Demystify Optimization and Generalization of Over-parameterized PAC-Bayesian Learning. Wei Huang, Chunrui Liu, Yilan Chen, Tianyu Liu, R. Xu. [BDL, MLT] 04 Feb 2022.
- Deep Layer-wise Networks Have Closed-Form Weights. Chieh-Tsai Wu, A. Masoomi, Arthur Gretton, Jennifer Dy. 01 Feb 2022.