The Road Less Scheduled (arXiv:2405.15682)

24 May 2024
Aaron Defazio
Xingyu Yang
Harsh Mehta
Konstantin Mishchenko
Ahmed Khaled
Ashok Cutkosky

Papers citing "The Road Less Scheduled"

43 / 43 papers shown
Deep Learning-Based Robust Optical Guidance for Hypersonic Platforms
Adrien Chan-Hon-Tong
A. Plyer
Baptiste Cadalen
Laurent Serre
19
0
0
09 May 2025
Plasma State Monitoring and Disruption Characterization using Multimodal VAEs
Y. Poels
Alessandro Pau
Christian Donner
Giulio Romanelli
Olivier Sauter
Cristina Venturini
Vlado Menkovski
TCV Team
WPTE team
24
0
0
24 Apr 2025
Tabular foundation model to detect empathy from visual cues
M. Hasan
Shafin Rahman
M. Hossain
Aneesh Krishna
Tom Gedeon
VLM
21
0
0
15 Apr 2025
S-EO: A Large-Scale Dataset for Geometry-Aware Shadow Detection in Remote Sensing Applications
Elías Masquil
Roger Marí
Thibaud Ehret
Enric Meinhardt-Llopis
Pablo Musé
Gabriele Facciolo
MDE
24
0
0
09 Apr 2025
TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining
Jeffrey Li
Mohammadreza Armandpour
Iman Mirzadeh
Sachin Mehta
Vaishaal Shankar
...
Samy Bengio
Oncel Tuzel
Mehrdad Farajtabar
Hadi Pouransari
Fartash Faghri
CLL
KELM
59
0
0
02 Apr 2025
Why risk matters for protein binder design
Tudor-Stefan Cotet
Igor Krawczuk
41
0
0
31 Mar 2025
GAITGen: Disentangled Motion-Pathology Impaired Gait Generative Model -- Bringing Motion Generation to the Clinical Domain
Vida Adeli
Soroush Mehraban
Majid Mirmehdi
Alan Whone
Benjamin Filtjens
Amirhossein Dadashzadeh
A. Fasano
Andrea Iaboni
Babak Taati
46
0
0
28 Mar 2025
A Probabilistic Neuro-symbolic Layer for Algebraic Constraint Satisfaction
Leander Kurscheidt
Paolo Morettin
Roberto Sebastiani
Andrea Passerini
Antonio Vergari
55
0
0
25 Mar 2025
MaskSDM with Shapley values to improve flexibility, robustness, and explainability in species distribution modeling
Robin Zbinden
Nina Van Tiel
Gencer Sumbul
Chiara Vanalli
B. Kellenberger
D. Tuia
39
0
0
17 Mar 2025
A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules
Kairong Luo
Haodong Wen
Shengding Hu
Zhenbo Sun
Zhiyuan Liu
Maosong Sun
Kaifeng Lyu
Wenguang Chen
CLL
59
1
0
17 Mar 2025
Improving Diffusion-based Inverse Algorithms under Few-Step Constraint via Learnable Linear Extrapolation
Jiawei Zhang
Ziyuan Liu
Leon Yan
Gen Li
Yuantao Gu
54
0
0
13 Mar 2025
Benefits of Learning Rate Annealing for Tuning-Robustness in Stochastic Optimization
Amit Attia
Tomer Koren
59
1
0
13 Mar 2025
Beyond Cosine Decay: On the effectiveness of Infinite Learning Rate Schedule for Continual Pre-training
Paul Janson
Vaibhav Singh
Paria Mehrbod
Adam Ibrahim
Irina Rish
Eugene Belilovsky
Benjamin Thérien
CLL
73
0
0
04 Mar 2025
TomoSelfDEQ: Self-Supervised Deep Equilibrium Learning for Sparse-Angle CT Reconstruction
T. Bubba
Matteo Santacesaria
Andrea Sebastiani
OOD
35
0
0
28 Feb 2025
Robust Confinement State Classification with Uncertainty Quantification through Ensembled Data-Driven Methods
Y. Poels
Cristina Venturini
Alessandro Pau
Olivier Sauter
Vlado Menkovski
TCV Team
WPTE team
48
1
0
24 Feb 2025
Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs
Shane Bergsma
Nolan Dey
Gurpreet Gosal
Gavia Gray
Daria Soboleva
Joel Hestness
50
5
0
21 Feb 2025
MedConv: Convolutions Beat Transformers on Long-Tailed Bone Density Prediction
Xuyin Qi
Zeyu Zhang
Huazhan Zheng
Mingxi Chen
Numan Kutaiba
...
Hongtao Mao
Y. Li
Zhibin Liao
Yang Zhao
Minh Nguyen Nhat To
MedIm
46
7
0
02 Feb 2025
360Brew: A Decoder-only Foundation Model for Personalized Ranking and Recommendation
Hamed Firooz
Maziar Sanjabi
Adrian Englhardt
Aman Gupta
Ben Levine
...
Xiaoling Zhai
Ya Xu
Yu Wang
Yun Dai
ALM
42
3
0
27 Jan 2025
Temporal Context Consistency Above All: Enhancing Long-Term Anticipation by Learning and Enforcing Temporal Constraints
Alberto Maté
Mariella Dimiccoli
AI4TS
26
0
0
27 Dec 2024
MARINA-P: Superior Performance in Non-smooth Federated Optimization with Adaptive Stepsizes
Igor Sokolov
Peter Richtárik
72
1
0
22 Dec 2024
Grams: Gradient Descent with Adaptive Momentum Scaling
Yang Cao
Xiaoyu Li
Zhao-quan Song
ODL
83
2
0
22 Dec 2024
Cautious Optimizers: Improving Training with One Line of Code
Kaizhao Liang
Lizhang Chen
B. Liu
Qiang Liu
ODL
98
5
0
25 Nov 2024
Optimizing Brain Tumor Segmentation with MedNeXt: BraTS 2024 SSA and Pediatrics
Sarim Hashmi
Juan Lugo
Abdelrahman Elsayed
Dinesh Saggurthi
Mohammed Elseiagy
Alikhan Nurkamal
Jaskaran Walia
F. Maani
Mohammad Yaqub
117
1
0
24 Nov 2024
CellPilot
Philipp Endres
Valentin Koch
Julia A. Schnabel
Carsten Marr
VLM
MedIm
61
0
0
23 Nov 2024
General framework for online-to-nonconvex conversion: Schedule-free SGD is also effective for nonconvex optimization
Kwangjun Ahn
Gagik Magakyan
Ashok Cutkosky
56
0
0
11 Nov 2024
Gradient Methods with Online Scaling
Wenzhi Gao
Ya-Chi Chu
Yinyu Ye
Madeleine Udell
20
1
0
04 Nov 2024
How Does Critical Batch Size Scale in Pre-training?
Hanlin Zhang
Depen Morwani
Nikhil Vyas
Jingfeng Wu
Difan Zou
Udaya Ghai
Dean Phillips Foster
Sham Kakade
64
8
0
29 Oct 2024
Analyzing Generative Models by Manifold Entropic Metrics
Daniel Galperin
Ullrich Köthe
DRL
16
0
0
25 Oct 2024
TabDPT: Scaling Tabular Foundation Models
Junwei Ma
Valentin Thomas
Rasa Hosseinzadeh
Hamidreza Kamkari
Alex Labach
Jesse C. Cresswell
Keyvan Golestan
Guangwei Yu
M. Volkovs
Anthony L. Caterini
LMTD
32
3
0
23 Oct 2024
Foundation Models for Rapid Autonomy Validation
Alec Farid
Peter Schleede
Aaron Huang
Christoffer Heckman
32
0
0
22 Oct 2024
Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws
Yiding Jiang
Allan Zhou
Zhili Feng
Sadhika Malladi
J. Zico Kolter
39
15
0
15 Oct 2024
A second-order-like optimizer with adaptive gradient scaling for deep learning
Jérôme Bolte
Ryan Boustany
Edouard Pauwels
Andrei Purica
ODL
25
0
0
08 Oct 2024
SOAP: Improving and Stabilizing Shampoo using Adam
Nikhil Vyas
Depen Morwani
Rosie Zhao
Itai Shapira
David Brandfonbrener
Lucas Janson
Sham Kakade
59
23
0
17 Sep 2024
An Adaptive Stochastic Gradient Method with Non-negative Gauss-Newton Stepsizes
Antonio Orvieto
Lin Xiao
32
2
0
05 Jul 2024
Clinically inspired enhance Explainability and Interpretability of an AI-Tool for BCC diagnosis based on expert annotation
Iván Matas
Carmen Serrano
Francisca Silva
Amalia Serrano
Tomás Toledo-Pastrana
B. Acha
18
0
0
27 Jun 2024
Adam-mini: Use Fewer Learning Rates To Gain More
Yushun Zhang
Congliang Chen
Ziniu Li
Tian Ding
Chenwei Wu
Yinyu Ye
Zhi-Quan Luo
Ruoyu Sun
34
33
0
24 Jun 2024
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations
Alexander Hägele
Elie Bakouch
Atli Kosson
Loubna Ben Allal
Leandro von Werra
Martin Jaggi
36
33
0
28 May 2024
4-bit Shampoo for Memory-Efficient Network Training
Sike Wang
Jia Li
Pan Zhou
Hua Huang
MQ
31
5
0
28 May 2024
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,412
0
11 Nov 2021
Densely Connected Convolutional Networks
Gao Huang
Zhuang Liu
Laurens van der Maaten
Kilian Q. Weinberger
PINN
3DV
247
36,237
0
25 Aug 2016
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
282
39,170
0
01 Sep 2014
A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method
Simon Lacoste-Julien
Mark W. Schmidt
Francis R. Bach
113
259
0
10 Dec 2012
Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes
Ohad Shamir
Tong Zhang
99
571
0
08 Dec 2012