A Convergence Theory for Deep Learning via Over-Parameterization

9 November 2018

Papers citing "A Convergence Theory for Deep Learning via Over-Parameterization"

50 / 334 papers shown

Learning Guarantee of Reward Modeling Using Deep Neural Networks Yuanhang Luo Yeheng Ge Ruijian Han Guohao Shen 34 0 0 10 May 2025
Information-theoretic reduction of deep neural networks to linear models in the overparametrized proportional regime Francesco Camilli D. Tieplova Eleonora Bergamin Jean Barbier 135 0 0 06 May 2025
LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection Xinyue Zeng Haohui Wang Junhong Lin Jun Wu Tyler Cody Dawei Zhou 118 0 0 01 May 2025
Ultra-fast feature learning for the training of two-layer neural networks in the two-timescale regime Raphael Barboni Gabriel Peyré François-Xavier Vialard MLT 39 0 0 25 Apr 2025
Statistically guided deep learning Michael Kohler A. Krzyżak ODL BDL 79 0 0 11 Apr 2025
Fractal and Regular Geometry of Deep Neural Networks Simmaco Di Lillo Domenico Marinucci Michele Salvi Stefano Vigogna MDE AI4CE 36 0 0 08 Apr 2025
On the Cone Effect in the Learning Dynamics Zhanpeng Zhou Yongyi Yang Jie Ren Mahito Sugiyama Junchi Yan 53 0 0 20 Mar 2025
Video Latent Flow Matching: Optimal Polynomial Projections for Video Interpolation and Extrapolation Yang Cao Zhao Song Chiwun Yang VGen 46 2 0 01 Feb 2025
Learn Sharp Interface Solution by Homotopy Dynamics Chuqi Chen Yahong Yang Yang Xiang Wenrui Hao ODL 59 1 0 01 Feb 2025
Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input Ziang Chen Rong Ge MLT 61 1 0 10 Jan 2025
Fast Gradient Computation for RoPE Attention in Almost Linear Time Yifang Chen Jiayan Huo Xiaoyu Li Yingyu Liang Zhenmei Shi Zhao Song 61 12 0 03 Jan 2025
Optimization Insights into Deep Diagonal Linear Networks Hippolyte Labarrière C. Molinari Lorenzo Rosasco S. Villa Cristian Vega 76 0 0 21 Dec 2024
Puzzle: Distillation-Based NAS for Inference-Optimized LLMs Akhiad Bercovich Tomer Ronen Talor Abramovich Nir Ailon Nave Assaf ... Ido Shahaf Oren Tropp Omer Ullman Argov Ran Zilberstein Ran El-Yaniv 77 1 0 28 Nov 2024
Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits H. Bui Enrique Mallada Anqi Liu 132 0 0 08 Nov 2024
Interplay between Federated Learning and Explainable Artificial Intelligence: a Scoping Review Luis M. Lopez-Ramos Florian Leiser Aditya Rastogi Steven Hicks Inga Strümke V. Madai Tobias Budig Ali Sunyaev A. Hilbert 30 1 0 07 Nov 2024
Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data Binghui Li Yuanzhi Li OOD 33 2 0 11 Oct 2024
On the Impacts of the Random Initialization in the Neural Tangent Kernel Theory Guhan Chen Yicheng Li Qian Lin AAML 38 1 0 08 Oct 2024
Extended convexity and smoothness and their applications in deep learning Binchuan Qi Wei Gong Li Li 63 0 0 08 Oct 2024
SHAP values via sparse Fourier representation Ali Gorji Andisheh Amrollahi A. Krause FAtt 38 0 0 08 Oct 2024
From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks Clémentine Dominé Nicolas Anguita A. Proca Lukas Braun D. Kunin P. Mediano Andrew M. Saxe 38 3 0 22 Sep 2024
Unraveling the Hessian: A Key to Smooth Convergence in Loss Function Landscapes Nikita Kiselev Andrey Grabovoy 54 1 0 18 Sep 2024
Monomial Matrix Group Equivariant Neural Functional Networks Hoang V. Tran Thieu N. Vo Tho H. Tran An T. Nguyen Tan M. Nguyen 54 5 0 18 Sep 2024
How DNNs break the Curse of Dimensionality: Compositionality and Symmetry Learning Arthur Jacot Seok Hoan Choi Yuxiao Wen AI4CE 91 2 0 08 Jul 2024
MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation Lu Li Tianze Zhang Zhiqi Bu Suyuchen Wang Huan He Jie Fu Yonghui Wu Jiang Bian Yong Chen Yoshua Bengio FedML MoMe 100 3 0 11 Jun 2024
Loss Gradient Gaussian Width based Generalization and Optimization Guarantees A. Banerjee Qiaobo Li Yingxue Zhou 49 0 0 11 Jun 2024
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation Can Yaras Peng Wang Laura Balzano Qing Qu AI4CE 37 12 0 06 Jun 2024
Reparameterization invariance in approximate Bayesian inference Hrittik Roy M. Miani Carl Henrik Ek Philipp Hennig Marvin Pfortner Lukas Tatzel Søren Hauberg BDL 47 8 0 05 Jun 2024
Analysis of the rate of convergence of an over-parametrized convolutional neural network image classifier learned by gradient descent Michael Kohler A. Krzyżak Benjamin Walter 36 1 0 13 May 2024
An Improved Finite-time Analysis of Temporal Difference Learning with Deep Neural Networks Zhifa Ke Zaiwen Wen Junyu Zhang 37 0 0 07 May 2024
Machine Unlearning via Null Space Calibration Huiqiang Chen Tianqing Zhu Xin Yu Wanlei Zhou 41 6 0 21 Apr 2024
Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks Matteo Tucat Anirbit Mukherjee Procheta Sen Mingfei Sun Omar Rivasplata MLT 39 1 0 12 Apr 2024
CAM-Based Methods Can See through Walls Magamed Taimeskhanov R. Sicre Damien Garreau 21 1 0 02 Apr 2024
Understanding the training of infinitely deep and wide ResNets with Conditional Optimal Transport Raphael Barboni Gabriel Peyré François-Xavier Vialard 37 3 0 19 Mar 2024
NTK-Guided Few-Shot Class Incremental Learning Jingren Liu Zhong Ji Yanwei Pang YunLong Yu CLL 39 3 0 19 Mar 2024
Anytime Neural Architecture Search on Tabular Data Naili Xing Shaofeng Cai Zhaojing Luo Bengchin Ooi Jian Pei 34 1 0 15 Mar 2024
How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance Hongkang Li Shuai Zhang Yihua Zhang Meng Wang Sijia Liu Pin-Yu Chen 41 4 0 12 Mar 2024
Non-convergence to global minimizers for Adam and stochastic gradient descent optimization and constructions of local minimizers in the training of artificial neural networks Arnulf Jentzen Adrian Riekert 38 4 0 07 Feb 2024
Architectural Strategies for the optimization of Physics-Informed Neural Networks Hemanth Saratchandran Shin-Fang Chng Simon Lucey AI4CE 39 0 0 05 Feb 2024
Uncertainty-Aware Explainable Recommendation with Large Language Models Yicui Peng Hao Chen C. Lin Guo Huang Jinrong Hu Hui Guo Bin Kong Shu Hu Xi Wu Xin Wang LRM 58 8 0 31 Jan 2024
Weak Correlations as the Underlying Principle for Linearization of Gradient-Based Learning Systems Ori Shem-Ur Yaron Oz 19 0 0 08 Jan 2024
Analysis of the expected $L_2$ error of an over-parametrized deep neural network estimate learned by gradient descent without regularization Selina Drews Michael Kohler 36 2 0 24 Nov 2023
Demystifying the Myths and Legends of Nonconvex Convergence of SGD Aritra Dutta El Houcine Bergou Soumia Boucherouite Nicklas Werge M. Kandemir Xin Li 26 0 0 19 Oct 2023
Differentially Private Non-convex Learning for Multi-layer Neural Networks Hanpu Shen Cheng-Long Wang Zihang Xiang Yiming Ying Di Wang 49 7 0 12 Oct 2023
Sparse Deep Learning for Time Series Data: Theory and Applications Mingxuan Zhang Y. Sun Faming Liang AI4TS OOD BDL 39 2 0 05 Oct 2023
Fundamental Limits of Deep Learning-Based Binary Classifiers Trained with Hinge Loss T. Getu Georges Kaddoum M. Bennis 40 1 0 13 Sep 2023
How to Protect Copyright Data in Optimization of Large Language Models? T. Chu Zhao Song Chiwun Yang 40 29 0 23 Aug 2023
SAfER: Layer-Level Sensitivity Assessment for Efficient and Robust Neural Network Inference Edouard Yvinec Arnaud Dapogny Kévin Bailly Xavier Fischer AAML 10 2 0 09 Aug 2023
Understanding Deep Neural Networks via Linear Separability of Hidden Layers Chao Zhang Xinyuan Chen Wensheng Li Lixue Liu Wei Wu Dacheng Tao 28 3 0 26 Jul 2023
Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification Lianke Qin Zhao Song Yuanyuan Yang 25 9 0 13 Jul 2023
Test-Time Training on Video Streams Renhao Wang Yu Sun Yossi Gandelsman Xinlei Chen Alexei A. Efros Xiaolong Wang TTA ViT 3DGS 41 16 0 11 Jul 2023