2310.11428
Cited By
Butterfly Effects of SGD Noise: Error Amplification in Behavior Cloning and Autoregression
17 October 2023
Adam Block
Dylan J. Foster
Akshay Krishnamurthy
Max Simchowitz
Cyril Zhang
Papers citing
"Butterfly Effects of SGD Noise: Error Amplification in Behavior Cloning and Autoregression"
10 / 10 papers shown
FOLDER: Accelerating Multi-modal Large Language Models with Enhanced Performance
Haicheng Wang
Zhemeng Yu
Gabriele Spadaro
Chen Ju
Victor Quétu
Enzo Tartaglione
VLM
66
3
0
05 Jan 2025
TinyGSM: achieving >80% on GSM8k with small language models
Bingbin Liu
Sébastien Bubeck
Ronen Eldan
Janardhan Kulkarni
Yuanzhi Li
Anh Nguyen
Rachel A. Ward
Yi Zhang
ALM
16
47
0
14 Dec 2023
Efficient and Near-Optimal Smoothed Online Learning for Generalized Linear Functions
Adam Block
Max Simchowitz
36
11
0
25 May 2022
Preference Dynamics Under Personalized Recommendations
Sarah Dean
Jamie Morgenstern
58
34
0
25 May 2022
On the SDEs and Scaling Rules for Adaptive Gradient Algorithms
Sadhika Malladi
Kaifeng Lyu
A. Panigrahi
Sanjeev Arora
88
26
0
20 May 2022
Stabilizing Dynamical Systems via Policy Gradient Methods
Juan C. Perdomo
Jack Umenberger
Max Simchowitz
11
44
0
13 Oct 2021
On the Power of Differentiable Learning versus PAC and SQ Learning
Emmanuel Abbe
Pritish Kamath
Eran Malach
Colin Sandon
Nathan Srebro
MLT
45
22
0
09 Aug 2021
On the Sample Complexity of Stability Constrained Imitation Learning
Stephen Tu
Alexander Robey
Tingnan Zhang
Nikolai Matni
38
35
0
18 Feb 2021
Neural Mechanics: Symmetry and Broken Conservation Laws in Deep Learning Dynamics
D. Kunin
Javier Sagastuy-Breña
Surya Ganguli
Daniel L. K. Yamins
Hidenori Tanaka
97
77
0
08 Dec 2020
A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method
Simon Lacoste-Julien
Mark W. Schmidt
Francis R. Bach
109
253
0
10 Dec 2012