Butterfly Effects of SGD Noise: Error Amplification in Behavior Cloning and Autoregression

17 October 2023

Dylan J. Foster

Akshay Krishnamurthy

Max Simchowitz

Papers citing "Butterfly Effects of SGD Noise: Error Amplification in Behavior Cloning and Autoregression"

Title | Authors | Tag | Counts | Date
FOLDER: Accelerating Multi-modal Large Language Models with Enhanced Performance | Haicheng Wang, Zhemeng Yu, Gabriele Spadaro, Chen Ju, Victor Quétu, Enzo Tartaglione | VLM | 66 3 0 | 05 Jan 2025
TinyGSM: achieving >80% on GSM8k with small language models | Bingbin Liu, Sébastien Bubeck, Ronen Eldan, Janardhan Kulkarni, Yuanzhi Li, Anh Nguyen, Rachel A. Ward, Yi Zhang | ALM | 16 47 0 | 14 Dec 2023
Efficient and Near-Optimal Smoothed Online Learning for Generalized Linear Functions | Adam Block, Max Simchowitz | | 36 11 0 | 25 May 2022
Preference Dynamics Under Personalized Recommendations | Sarah Dean, Jamie Morgenstern | | 58 34 0 | 25 May 2022
On the SDEs and Scaling Rules for Adaptive Gradient Algorithms | Sadhika Malladi, Kaifeng Lyu, A. Panigrahi, Sanjeev Arora | | 88 26 0 | 20 May 2022
Stabilizing Dynamical Systems via Policy Gradient Methods | Juan C. Perdomo, Jack Umenberger, Max Simchowitz | | 11 44 0 | 13 Oct 2021
On the Power of Differentiable Learning versus PAC and SQ Learning | Emmanuel Abbe, Pritish Kamath, Eran Malach, Colin Sandon, Nathan Srebro | MLT | 45 22 0 | 09 Aug 2021
On the Sample Complexity of Stability Constrained Imitation Learning | Stephen Tu, Alexander Robey, Tingnan Zhang, Nikolai Matni | | 38 35 0 | 18 Feb 2021
Neural Mechanics: Symmetry and Broken Conservation Laws in Deep Learning Dynamics | D. Kunin, Javier Sagastuy-Breña, Surya Ganguli, Daniel L. K. Yamins, Hidenori Tanaka | | 97 77 0 | 08 Dec 2020
A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method | Simon Lacoste-Julien, Mark W. Schmidt, Francis R. Bach | | 109 253 0 | 10 Dec 2012