2405.15682
Cited By
The Road Less Scheduled
24 May 2024
Aaron Defazio
Xingyu Yang
Harsh Mehta
Konstantin Mishchenko
Ahmed Khaled
Ashok Cutkosky
Papers citing "The Road Less Scheduled"
43 / 43 papers shown
Deep Learning-Based Robust Optical Guidance for Hypersonic Platforms
Adrien Chan-Hon-Tong
A. Plyer
Baptiste Cadalen
Laurent Serre
16
0
0
09 May 2025
Plasma State Monitoring and Disruption Characterization using Multimodal VAEs
Y. Poels
Alessandro Pau
Christian Donner
Giulio Romanelli
Olivier Sauter
Cristina Venturini
Vlado Menkovski
TCV Team
WPTE team
24
0
0
24 Apr 2025
Tabular foundation model to detect empathy from visual cues
M. Hasan
Shafin Rahman
M. Hossain
Aneesh Krishna
Tom Gedeon
VLM
21
0
0
15 Apr 2025
S-EO: A Large-Scale Dataset for Geometry-Aware Shadow Detection in Remote Sensing Applications
Elías Masquil
Roger Marí
Thibaud Ehret
Enric Meinhardt-Llopis
Pablo Musé
Gabriele Facciolo
MDE
24
0
0
09 Apr 2025
TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining
Jeffrey Li
Mohammadreza Armandpour
Iman Mirzadeh
Sachin Mehta
Vaishaal Shankar
...
Samy Bengio
Oncel Tuzel
Mehrdad Farajtabar
Hadi Pouransari
Fartash Faghri
CLL
KELM
59
0
0
02 Apr 2025
Why risk matters for protein binder design
Tudor-Stefan Cotet
Igor Krawczuk
41
0
0
31 Mar 2025
GAITGen: Disentangled Motion-Pathology Impaired Gait Generative Model -- Bringing Motion Generation to the Clinical Domain
Vida Adeli
Soroush Mehraban
Majid Mirmehdi
Alan Whone
Benjamin Filtjens
Amirhossein Dadashzadeh
A. Fasano
Andrea Iaboni
Babak Taati
44
0
0
28 Mar 2025
A Probabilistic Neuro-symbolic Layer for Algebraic Constraint Satisfaction
Leander Kurscheidt
Paolo Morettin
Roberto Sebastiani
Andrea Passerini
Antonio Vergari
55
0
0
25 Mar 2025
MaskSDM with Shapley values to improve flexibility, robustness, and explainability in species distribution modeling
Robin Zbinden
Nina Van Tiel
Gencer Sumbul
Chiara Vanalli
B. Kellenberger
D. Tuia
39
0
0
17 Mar 2025
A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules
Kairong Luo
Haodong Wen
Shengding Hu
Zhenbo Sun
Zhiyuan Liu
Maosong Sun
Kaifeng Lyu
Wenguang Chen
CLL
59
1
0
17 Mar 2025
Improving Diffusion-based Inverse Algorithms under Few-Step Constraint via Learnable Linear Extrapolation
Jiawei Zhang
Ziyuan Liu
Leon Yan
Gen Li
Yuantao Gu
54
0
0
13 Mar 2025
Benefits of Learning Rate Annealing for Tuning-Robustness in Stochastic Optimization
Amit Attia
Tomer Koren
56
1
0
13 Mar 2025
Beyond Cosine Decay: On the effectiveness of Infinite Learning Rate Schedule for Continual Pre-training
Paul Janson
Vaibhav Singh
Paria Mehrbod
Adam Ibrahim
Irina Rish
Eugene Belilovsky
Benjamin Thérien
CLL
73
0
0
04 Mar 2025
TomoSelfDEQ: Self-Supervised Deep Equilibrium Learning for Sparse-Angle CT Reconstruction
T. Bubba
Matteo Santacesaria
Andrea Sebastiani
OOD
35
0
0
28 Feb 2025
Robust Confinement State Classification with Uncertainty Quantification through Ensembled Data-Driven Methods
Y. Poels
Cristina Venturini
Alessandro Pau
Olivier Sauter
Vlado Menkovski
TCV Team
WPTE team
46
1
0
24 Feb 2025
Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs
Shane Bergsma
Nolan Dey
Gurpreet Gosal
Gavia Gray
Daria Soboleva
Joel Hestness
50
5
0
21 Feb 2025
MedConv: Convolutions Beat Transformers on Long-Tailed Bone Density Prediction
Xuyin Qi
Zeyu Zhang
Huazhan Zheng
Mingxi Chen
Numan Kutaiba
...
Hongtao Mao
Y. Li
Zhibin Liao
Yang Zhao
Minh Nguyen Nhat To
MedIm
46
7
0
02 Feb 2025
360Brew: A Decoder-only Foundation Model for Personalized Ranking and Recommendation
Hamed Firooz
Maziar Sanjabi
Adrian Englhardt
Aman Gupta
Ben Levine
...
Xiaoling Zhai
Ya Xu
Yu Wang
Yun Dai
ALM
42
3
0
27 Jan 2025
Temporal Context Consistency Above All: Enhancing Long-Term Anticipation by Learning and Enforcing Temporal Constraints
Alberto Maté
Mariella Dimiccoli
AI4TS
26
0
0
27 Dec 2024
MARINA-P: Superior Performance in Non-smooth Federated Optimization with Adaptive Stepsizes
Igor Sokolov
Peter Richtárik
72
1
0
22 Dec 2024
Grams: Gradient Descent with Adaptive Momentum Scaling
Yang Cao
Xiaoyu Li
Zhao-quan Song
ODL
83
2
0
22 Dec 2024
Cautious Optimizers: Improving Training with One Line of Code
Kaizhao Liang
Lizhang Chen
B. Liu
Qiang Liu
ODL
98
5
0
25 Nov 2024
Optimizing Brain Tumor Segmentation with MedNeXt: BraTS 2024 SSA and Pediatrics
Sarim Hashmi
Juan Lugo
Abdelrahman Elsayed
Dinesh Saggurthi
Mohammed Elseiagy
Alikhan Nurkamal
Jaskaran Walia
F. Maani
Mohammad Yaqub
114
1
0
24 Nov 2024
CellPilot
Philipp Endres
Valentin Koch
Julia A. Schnabel
Carsten Marr
VLM
MedIm
59
0
0
23 Nov 2024
General framework for online-to-nonconvex conversion: Schedule-free SGD is also effective for nonconvex optimization
Kwangjun Ahn
Gagik Magakyan
Ashok Cutkosky
53
0
0
11 Nov 2024
Gradient Methods with Online Scaling
Wenzhi Gao
Ya-Chi Chu
Yinyu Ye
Madeleine Udell
20
1
0
04 Nov 2024
How Does Critical Batch Size Scale in Pre-training?
Hanlin Zhang
Depen Morwani
Nikhil Vyas
Jingfeng Wu
Difan Zou
Udaya Ghai
Dean Phillips Foster
Sham Kakade
64
8
0
29 Oct 2024
Analyzing Generative Models by Manifold Entropic Metrics
Daniel Galperin
Ullrich Köthe
DRL
16
0
0
25 Oct 2024
TabDPT: Scaling Tabular Foundation Models
Junwei Ma
Valentin Thomas
Rasa Hosseinzadeh
Hamidreza Kamkari
Alex Labach
Jesse C. Cresswell
Keyvan Golestan
Guangwei Yu
M. Volkovs
Anthony L. Caterini
LMTD
32
3
0
23 Oct 2024
Foundation Models for Rapid Autonomy Validation
Alec Farid
Peter Schleede
Aaron Huang
Christoffer Heckman
32
0
0
22 Oct 2024
Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws
Yiding Jiang
Allan Zhou
Zhili Feng
Sadhika Malladi
J. Zico Kolter
39
15
0
15 Oct 2024
A second-order-like optimizer with adaptive gradient scaling for deep learning
Jérôme Bolte
Ryan Boustany
Edouard Pauwels
Andrei Purica
ODL
22
0
0
08 Oct 2024
SOAP: Improving and Stabilizing Shampoo using Adam
Nikhil Vyas
Depen Morwani
Rosie Zhao
Itai Shapira
David Brandfonbrener
Lucas Janson
Sham Kakade
59
23
0
17 Sep 2024
An Adaptive Stochastic Gradient Method with Non-negative Gauss-Newton Stepsizes
Antonio Orvieto
Lin Xiao
32
2
0
05 Jul 2024
Clinically inspired enhance Explainability and Interpretability of an AI-Tool for BCC diagnosis based on expert annotation
Iván Matas
Carmen Serrano
Francisca Silva
Amalia Serrano
Tomás Toledo-Pastrana
B. Acha
18
0
0
27 Jun 2024
Adam-mini: Use Fewer Learning Rates To Gain More
Yushun Zhang
Congliang Chen
Ziniu Li
Tian Ding
Chenwei Wu
Yinyu Ye
Zhi-Quan Luo
Ruoyu Sun
34
33
0
24 Jun 2024
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations
Alexander Hägele
Elie Bakouch
Atli Kosson
Loubna Ben Allal
Leandro von Werra
Martin Jaggi
36
33
0
28 May 2024
4-bit Shampoo for Memory-Efficient Network Training
Sike Wang
Jia Li
Pan Zhou
Hua Huang
MQ
31
5
0
28 May 2024
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,412
0
11 Nov 2021
Densely Connected Convolutional Networks
Gao Huang
Zhuang Liu
Laurens van der Maaten
Kilian Q. Weinberger
PINN
3DV
247
36,237
0
25 Aug 2016
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
282
39,170
0
01 Sep 2014
A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method
Simon Lacoste-Julien
Mark W. Schmidt
Francis R. Bach
111
259
0
10 Dec 2012
Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes
Ohad Shamir
Tong Zhang
99
571
0
08 Dec 2012