On the Variance of the Adaptive Learning Rate and Beyond

8 August 2019
Liyuan Liu
Haoming Jiang
Pengcheng He
Weizhu Chen
Xiaodong Liu
Jianfeng Gao
Jiawei Han
    ODL
ArXiv (abs) | PDF | HTML | Github (2548★)

Papers citing "On the Variance of the Adaptive Learning Rate and Beyond"

50 / 864 papers shown
ParaGAN: A Scalable Distributed Training Framework for Generative Adversarial Networks
Ziji Shi
Jialin Li
Yang You
54
1
0
06 Nov 2024
Analyzing & Reducing the Need for Learning Rate Warmup in GPT Training
Atli Kosson
Bettina Messmer
Martin Jaggi
AI4CE
71
5
0
31 Oct 2024
Consistency Diffusion Bridge Models
Guande He
Kaiwen Zheng
Jianfei Chen
Fan Bao
Jun-Jie Zhu
DiffM
124
5
0
30 Oct 2024
USpeech: Ultrasound-Enhanced Speech with Minimal Human Effort via Cross-Modal Synthesis
Luca Jiang-Tao Yu
Running Zhao
Sijie Ji
Edith C.H. Ngai
Chenshu Wu
52
0
0
29 Oct 2024
Data Generation for Hardware-Friendly Post-Training Quantization
Lior Dikstein
Ariel Lapid
Arnon Netzer
H. Habi
MQ
482
0
0
29 Oct 2024
Contrastive Learning with Auxiliary User Detection for Identifying Activities
Wen Ge
Guanyi Mou
Emmanuel O. Agu
Kyumin Lee
AAML
65
0
0
21 Oct 2024
Fully Explicit Dynamic Gaussian Splatting
Junoh Lee
Chang-Yeon Won
Hyunjun Jung
Inhwan Bae
Hae-Gon Jeon
3DGS
108
12
0
21 Oct 2024
Generalized Probabilistic Attention Mechanism in Transformers
DongNyeong Heo
Heeyoul Choi
98
1
0
21 Oct 2024
Almost-Linear RNNs Yield Highly Interpretable Symbolic Codes in Dynamical Systems Reconstruction
Manuel Brenner
Christoph Jürgen Hemmer
Zahra Monfared
Daniel Durstewitz
AI4CE
80
4
0
18 Oct 2024
A Mirror Descent Perspective of Smoothed Sign Descent
Shuyang Wang
Diego Klabjan
75
1
0
18 Oct 2024
Heterogeneous Graph Generation: A Hierarchical Approach using Node Feature Pooling
Hritaban Ghosh
Chen Changyu
Arunesh Sinha
Shamik Sural
62
0
0
15 Oct 2024
What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis
Weronika Ormaniec
Felix Dangel
Sidak Pal Singh
141
7
0
14 Oct 2024
Neural Material Adaptor for Visual Grounding of Intrinsic Dynamics
Junyi Cao
Shanyan Guan
Yanhao Ge
Wei Li
Xiaokang Yang
Chao Ma
AI4CE
87
8
0
10 Oct 2024
Unveiling the Backbone-Optimizer Coupling Bias in Visual Representation Learning
Siyuan Li
Juanxi Tian
Zedong Wang
Luyuan Zhang
Zicheng Liu
Weiyang Jin
Yang Liu
Baigui Sun
Stan Z. Li
95
0
0
08 Oct 2024
ZEBRA: Zero-Shot Example-Based Retrieval Augmentation for Commonsense Question Answering
Francesco Maria Molfese
Simone Conia
Riccardo Orlando
Roberto Navigli
ReLM LRM RALM
65
3
0
07 Oct 2024
Computational design of target-specific linear peptide binders with TransformerBeta
Haowen Zhao
Francesco A. Aprile
Barbara Bravi
77
0
0
07 Oct 2024
Learning Interpretable Hierarchical Dynamical Systems Models from Time Series Data
Manuel Brenner
Elias Weber
G. Koppe
Daniel Durstewitz
AI4TS AI4CE
118
8
0
07 Oct 2024
Image First or Text First? Optimising the Sequencing of Modalities in Large Language Model Prompting and Reasoning Tasks
Grant Wardle
Teo Susnjak
LRM
108
6
0
04 Oct 2024
Fira: Can We Achieve Full-rank Training of LLMs Under Low-rank Constraint?
Xi Chen
Kaituo Feng
Changsheng Li
Xunhao Lai
Xiangyu Yue
Ye Yuan
Guoren Wang
94
15
0
02 Oct 2024
Question-guided Knowledge Graph Re-scoring and Injection for Knowledge Graph Question Answering
Yu Zhang
Kehai Chen
Xuefeng Bai
Zhao Kang
Quanjiang Guo
Min Zhang
120
12
0
02 Oct 2024
Dual Encoder GAN Inversion for High-Fidelity 3D Head Reconstruction from Single Images
Bahri Batuhan Bilecen
Ahmet Berke Gokmen
Aysegül Dündar
84
2
0
30 Sep 2024
Simple and Fast Distillation of Diffusion Models
Zhenyu Zhou
Defang Chen
Can Wang
Chun Chen
Siwei Lyu
DiffM
75
8
0
29 Sep 2024
Deep Heterogeneous Contrastive Hyper-Graph Learning for In-the-Wild Context-Aware Human Activity Recognition
Wen Ge
Guanyi Mou
Emmanuel O. Agu
Kyumin Lee
HAI BDL
62
6
0
27 Sep 2024
Tackling fluffy clouds: robust field boundary delineation across global agricultural landscapes with Sentinel-1 and Sentinel-2 Time Series
F. Diakogiannis
Zheng-Shu Zhou
Jeff Wang
Gonzalo Mata
Dave Henry
...
Jonathan Richetti
Kathryn Batchelor
Chris Herrmann
Andrew Toovey
John Taylor
77
0
0
20 Sep 2024
Convergence of Sharpness-Aware Minimization Algorithms using Increasing Batch Size and Decaying Learning Rate
Hinata Harada
Hideaki Iiduka
59
1
0
16 Sep 2024
Stable Language Model Pre-training by Reducing Embedding Variability
Woojin Chung
Jiwoo Hong
Na Min An
James Thorne
Se-Young Yun
78
3
0
12 Sep 2024
Retro-li: Small-Scale Retrieval Augmented Generation Supporting Noisy Similarity Searches and Domain Shift Generalization
Gentiana Rashiti
G. Karunaratne
Mrinmaya Sachan
Abu Sebastian
Abbas Rahimi
RALM
230
0
0
12 Sep 2024
Dynamic Decoupling of Placid Terminal Attractor-based Gradient Descent Algorithm
Jinwei Zhao
Marco Gori
Alessandro Betti
S. Melacci
Hongtao Zhang
Jiedong Liu
Xinhong Hei
86
0
0
10 Sep 2024
Reducing Bias in Deep Learning Optimization: The RSGDM Approach
Honglin Qin
Hongye Zheng
Bingxing Wang
Zhizhong Wu
Bingyao Liu
Yuanfang Yang
57
8
0
05 Sep 2024
Does Data-Efficient Generalization Exacerbate Bias in Foundation Models?
Dilermando Queiroz
Anderson Carlos
Maíra Fatoretto
Luis Filipe Nakayama
André Anjos
Lilian Berton
138
0
0
28 Aug 2024
Bidirectional Awareness Induction in Autoregressive Seq2Seq Models
J. Hu
Roberto Cavicchioli
Alessandro Capotondi
BDL
95
0
0
25 Aug 2024
Guardians of the Machine Translation Meta-Evaluation: Sentinel Metrics Fall In!
Stefano Perrella
Lorenzo Proietti
Alessandro Sciré
Edoardo Barba
Roberto Navigli
85
4
0
25 Aug 2024
cc-DRL: a Convex Combined Deep Reinforcement Learning Flight Control Design for a Morphing Quadrotor
Tao Yang
Huai-Ning Wu
Jun-Wei Wang
152
0
0
23 Aug 2024
Scaling Law with Learning Rate Annealing
Howe Tissue
Venus Wang
Lu Wang
108
9
0
20 Aug 2024
NeRF-US: Removing Ultrasound Imaging Artifacts from Neural Radiance Fields in the Wild
Rishit Dagli
Atsuhiro Hibi
R. Krishnan
Pascal N Tyrrell
96
6
0
13 Aug 2024
Fast-and-Frugal Text-Graph Transformers are Effective Link Predictors
Andrei Catalin Coman
Christos Theodoropoulos
Marie-Francine Moens
James Henderson
88
0
0
13 Aug 2024
Music2Latent: Consistency Autoencoders for Latent Audio Compression
Marco Pasini
Stefan Lattner
George Fazekas
70
8
0
12 Aug 2024
Decoupled Vocabulary Learning Enables Zero-Shot Translation from Unseen Languages
Carlos Mullov
Ngoc-Quan Pham
Alexander Waibel
97
1
0
05 Aug 2024
DeMansia: Mamba Never Forgets Any Tokens
Ricky Fang
Mamba
48
0
0
04 Aug 2024
Deep Learning Framework for History Matching CO2 Storage with 4D Seismic and Monitoring Well Data
Ekta U. Samani
A. Banerjee
103
0
0
02 Aug 2024
What comes after transformers? -- A selective survey connecting ideas in deep learning
Johannes Schneider
AI4CE
112
2
0
01 Aug 2024
ReLiK: Retrieve and LinK, Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget
Adam Gould
Pere-Lluis Huguet-Cabot
S. Dadhania
Francesca Toni
180
9
0
31 Jul 2024
dopanim: A Dataset of Doppelganger Animals with Noisy Annotations from Multiple Humans
M. Herde
Denis Huseljic
Lukas Rauch
Bernhard Sick
85
1
0
30 Jul 2024
No learning rates needed: Introducing SALSA -- Stable Armijo Line Search Adaptation
Philip Kenneweg
Tristan Kenneweg
Fabian Fumagalli
Barbara Hammer
ODL
59
0
0
30 Jul 2024
Robust VAEs via Generating Process of Noise Augmented Data
Hiroo Irobe
Wataru Aoki
Kimihiro Yamazaki
Yuhui Zhang
Takumi Nakagawa
Hiroki Waida
Yuichiro Wada
Takafumi Kanamori
AAML
56
0
0
26 Jul 2024
Amortized Active Learning for Nonparametric Functions
Cen-You Li
Marc Toussaint
Barbara Rakitsch
Christoph Zimmer
53
0
0
25 Jul 2024
Lymphoid Infiltration Assessment of the Tumor Margins in H&E Slides
Zhuxian Guo
Amine Marzouki
Jean-François Emile
Henning Muller
Camille Kurtz
Nicolas Loménie
33
0
0
23 Jul 2024
Monocular pose estimation of articulated surgical instruments in open surgery
Robert Spektor
Tom Friedman
Itay Or
Gil Bolotin
S. Laufer
93
0
0
16 Jul 2024
Finding Meaning in Points: Weakly Supervised Semantic Segmentation for Event Cameras
Hoonhee Cho
Sung-Hoon Yoon
H. Kweon
Kuk-Jin Yoon
77
2
0
15 Jul 2024
Sensorimotor Attention and Language-based Regressions in Shared Latent Variables for Integrating Robot Motion Learning and LLM
Kanata Suzuki
Tetsuya Ogata
82
2
0
12 Jul 2024