arXiv:2211.10691
Two Facets of SDE Under an Information-Theoretic Lens: Generalization of SGD via Training Trajectories and via Terminal States
19 November 2022
Ziqiao Wang
Yongyi Mao
Papers citing
"Two Facets of SDE Under an Information-Theoretic Lens: Generalization of SGD via Training Trajectories and via Terminal States"
14 / 14 papers shown
Title
Towards Auto-Regressive Next-Token Prediction: In-Context Learning Emerges from Generalization
Zixuan Gong
Xiaolin Hu
Huayi Tang
Yong Liu
33
0
0
24 Feb 2025
Generalization Bounds via Conditional f-Information
Ziqiao Wang
Yongyi Mao
FedML
40
1
0
30 Oct 2024
Theoretical Insights into Fine-Tuning Attention Mechanism: Generalization and Optimization
Xinhao Yao
Hongjin Qian
Xiaolin Hu
Gengze Xu
Wei Liu
Jian Luan
B. Wang
Y. Liu
48
0
0
03 Oct 2024
Enhancing Domain Adaptation through Prompt Gradient Alignment
Hoang Phan
Lam C. Tran
Quyen Tran
Trung Le
52
0
0
13 Jun 2024
Enhancing In-Context Learning Performance with just SVD-Based Weight Pruning: A Theoretical Perspective
Xinhao Yao
Xiaolin Hu
Shenzhi Yang
Yong Liu
39
2
0
06 Jun 2024
Sample-Conditioned Hypothesis Stability Sharpens Information-Theoretic Generalization Bounds
Ziqiao Wang
Yongyi Mao
22
5
0
31 Oct 2023
Tighter Information-Theoretic Generalization Bounds from Supersamples
Ziqiao Wang
Yongyi Mao
19
17
0
05 Feb 2023
Information-Theoretic Analysis of Unsupervised Domain Adaptation
Ziqiao Wang
Yongyi Mao
38
11
0
03 Oct 2022
Understanding Gradient Descent on Edge of Stability in Deep Learning
Sanjeev Arora
Zhiyuan Li
A. Panigrahi
MLT
75
89
0
19 May 2022
On the Generalization of Models Trained with SGD: Information-Theoretic Bounds and Implications
Ziqiao Wang
Yongyi Mao
FedML
MLT
32
22
0
07 Oct 2021
Stochastic Training is Not Necessary for Generalization
Jonas Geiping
Micah Goldblum
Phillip E. Pope
Michael Moeller
Tom Goldstein
81
72
0
29 Sep 2021
Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization
Stanislaw Jastrzebski
Devansh Arpit
Oliver Åstrand
Giancarlo Kerg
Huan Wang
Caiming Xiong
R. Socher
Kyunghyun Cho
Krzysztof J. Geras
AI4CE
179
65
0
28 Dec 2020
Information-Theoretic Generalization Bounds for SGLD via Data-Dependent Estimates
Jeffrey Negrea
Mahdi Haghifam
Gintare Karolina Dziugaite
Ashish Khisti
Daniel M. Roy
FedML
105
146
0
06 Nov 2019
PAC-Bayesian Supervised Classification: The Thermodynamics of Statistical Learning
O. Catoni
139
453
0
03 Dec 2007