Learning-Rate-Free Learning by D-Adaptation

International Conference on Machine Learning (ICML), 2023

18 January 2023

Aaron Defazio

Konstantin Mishchenko

Papers citing "Learning-Rate-Free Learning by D-Adaptation"

28 / 78 papers shown

Enhancing Stochastic Gradient Descent: A Unified Framework and Novel Acceleration Methods for Faster Convergence Yichuan Deng Zhao Song Chiwun Yang 108 1 0 02 Feb 2024
Stochastic Weakly Convex Optimization Beyond Lipschitz Continuity. International Conference on Machine Learning (ICML), 2024 Wenzhi Gao Qi Deng 134 6 0 25 Jan 2024
Masked Audio Generation using a Single Non-Autoregressive Transformer. International Conference on Learning Representations (ICLR), 2024 Alon Ziv Itai Gat Gaël Le Lan Tal Remez Felix Kreuk Alexandre Défossez Jade Copet Gabriel Synnaeve Yossi Adi 298 59 0 09 Jan 2024
Interpreting Adaptive Gradient Methods by Parameter Scaling for Learning-Rate-Free Optimization Min-Kook Suh Seung-Woo Seo ODL 178 0 0 06 Jan 2024
Mocap Everyone Everywhere: Lightweight Motion Capture With Smartwatches and a Head-Mounted Camera. Computer Vision and Pattern Recognition (CVPR), 2024 Jiye Lee Hanbyul Joo 208 27 0 01 Jan 2024
SANIA: Polyak-type Optimization Framework Leads to Scale Invariant Stochastic Algorithms Farshed Abdukhakimov Chulu Xiang Dmitry Kamzolov Robert Mansel Gower Martin Takáč 228 5 0 28 Dec 2023
Locally Optimal Descent for Dynamic Stepsize Scheduling. International Conference on Artificial Intelligence and Statistics (AISTATS), 2023 Gilad Yehudai Alon Cohen Amit Daniely Yoel Drori Tomer Koren Mariano Schain 190 0 0 23 Nov 2023
Non-Uniform Smoothness for Gradient Descent A. Berahas Lindon Roberts Fred Roosta 143 5 0 15 Nov 2023
An Automatic Learning Rate Schedule Algorithm for Achieving Faster Convergence and Steeper Descent Zhao Song Chiwun Yang 201 10 0 17 Oct 2023
A simple uniformly optimal method without line search for convex optimization Tianjiao Li Guanghui Lan 309 38 0 16 Oct 2023
Multiple Physics Pretraining for Physical Surrogate Models Michael McCabe Bruno Régaldo-Saint Blancard Liam Parker Ruben Ohana M. Cranmer ... Francois Lanusse Mariel Pettee Tiberiu Teşileanu Kyunghyun Cho Shirley Ho PINN AI4CE 220 81 0 04 Oct 2023
Small-scale proxies for large-scale Transformer training instabilities. International Conference on Learning Representations (ICLR), 2023 Mitchell Wortsman Peter J. Liu Lechao Xiao Katie Everett A. Alemi ... Jascha Narain Sohl-Dickstein Kelvin Xu Jaehoon Lee Justin Gilmer Simon Kornblith 227 131 0 25 Sep 2023
Ego3DPose: Capturing 3D Cues from Binocular Egocentric Views. ACM SIGGRAPH Conference and Exhibition on Computer Graphics and Interactive Techniques in Asia (SIGGRAPH Asia), 2023 Taeho Kang Kyungjin Lee Jinrui Zhang Youngki Lee EgoV 409 22 0 21 Sep 2023
Learning-Rate-Free Learning: Dissecting D-Adaptation and Probabilistic Line Search Max McGuinness ODL 82 0 0 06 Aug 2023
Adaptive Proximal Gradient Method for Convex Optimization. Neural Information Processing Systems (NeurIPS), 2023 Yura Malitsky Konstantin Mishchenko 215 50 0 04 Aug 2023
Oblivious Stochastic Composite Optimization Clément Lezane Cristóbal Guzmán 143 0 0 30 Jun 2023
Adaptive Federated Learning with Auto-Tuned Clients. International Conference on Learning Representations (ICLR), 2023 Junhyung Lyle Kim Taha Toghani César A. Uribe Anastasios Kyrillidis FedML 453 12 0 19 Jun 2023
Towards Stability of Autoregressive Neural Operators Michael McCabe P. Harrington Shashank Subramanian Jed Brown AI4CE 361 33 0 18 Jun 2023
Prodigy: An Expeditiously Adaptive Parameter-Free Learner. International Conference on Machine Learning (ICML), 2023 Konstantin Mishchenko Aaron Defazio ODL 326 101 0 09 Jun 2023
Simple and Controllable Music Generation. Neural Information Processing Systems (NeurIPS), 2023 Jade Copet Felix Kreuk Itai Gat Tal Remez David Kant Gabriel Synnaeve Yossi Adi Alexandre Défossez MGen 341 562 0 08 Jun 2023
Mechanic: A Learning Rate Tuner. Neural Information Processing Systems (NeurIPS), 2023 Ashok Cutkosky Aaron Defazio Harsh Mehta OffRL 339 21 0 31 May 2023
Parameter-free projected gradient descent Evgenii Chzhen Christophe Giraud Jean-Michel Poggi 160 4 0 31 May 2023
DoWG Unleashed: An Efficient Universal Parameter-Free Gradient Descent Method. Neural Information Processing Systems (NeurIPS), 2023 Ahmed Khaled Konstantin Mishchenko Chi Jin ODL 290 38 0 25 May 2023
Stochastic Ratios Tracking Algorithm for Large Scale Machine Learning Problems Shigeng Sun Yuchen Xie 89 3 0 17 May 2023
MoMo: Momentum Models for Adaptive Learning Rates. International Conference on Machine Learning (ICML), 2023 Fabian Schaipp Ruben Ohana Michael Eickenberg Aaron Defazio Robert Mansel Gower 277 19 0 12 May 2023
Random Function Descent. Neural Information Processing Systems (NeurIPS), 2023 Felix Benning L. Döring 152 1 0 02 May 2023
Stochastic Nonsmooth Convex Optimization with Heavy-Tailed Noises: High-Probability Bound, In-Expectation Rate and Initial Distance Adaptation Zijian Liu Zhengyuan Zhou 211 18 0 22 Mar 2023
DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule. International Conference on Machine Learning (ICML), 2023 Maor Ivgi Oliver Hinder Y. Carmon ODL 399 86 0 08 Feb 2023