On the Variance of the Adaptive Learning Rate and Beyond
arXiv:1908.03265

8 August 2019
Liyuan Liu
Haoming Jiang
Pengcheng He
Weizhu Chen
Xiaodong Liu
Jianfeng Gao
Jiawei Han
    ODL
GitHub: 2548★

Papers citing "On the Variance of the Adaptive Learning Rate and Beyond"

50 / 864 papers shown
Motion Puzzle: Arbitrary Motion Style Transfer by Body Part
Deok-Kyeong Jang
S. Park
Sung-Hee Lee
3DH
70
60
0
10 Feb 2022
Particle Transformer for Jet Tagging
H. Qu
Congqiao Li
Sitian Qian
ViT, MedIm
85
106
0
08 Feb 2022
No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models
Chen Liang
Haoming Jiang
Simiao Zuo
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
T. Zhao
72
14
0
06 Feb 2022
Boundary-aware Information Maximization for Self-supervised Medical Image Segmentation
Jizong Peng
Ping Wang
M. Pedersoli
Christian Desrosiers
SSL
77
6
0
04 Feb 2022
Global Optimization Networks
Sen Zhao
Erez Louidor Ilan
Oleksandr Mangylov
Maya R. Gupta
111
6
0
02 Feb 2022
On the Power-Law Hessian Spectrums in Deep Learning
Zeke Xie
Qian-Yuan Tang
Yunfeng Cai
Mingming Sun
P. Li
ODL
99
10
0
31 Jan 2022
A Stochastic Bundle Method for Interpolating Networks
Alasdair Paren
Leonard Berrada
Rudra P. K. Poudel
M. P. Kumar
76
4
0
29 Jan 2022
Benchmarking Conventional Vision Models on Neuromorphic Fall Detection and Action Recognition Dataset
Karthik Sivarama Krishnan
Koushik Sivarama Krishnan
48
5
0
28 Jan 2022
Learning to Minimize the Remainder in Supervised Learning
Yan Luo
Yongkang Wong
Mohan S. Kankanhalli
Qi Zhao
97
1
0
23 Jan 2022
AdaTerm: Adaptive T-Distribution Estimated Robust Moments for Noise-Robust Stochastic Gradient Optimization
Wendyam Eric Lionel Ilboudo
Taisuke Kobayashi
Takamitsu Matsubara
89
13
0
18 Jan 2022
Generalization in Supervised Learning Through Riemannian Contraction
L. Kozachkov
Patrick M. Wensing
Jean-Jacques E. Slotine
MLT
91
9
0
17 Jan 2022
Data-Efficient Information Extraction from Form-Like Documents
Beliz Gunel
Navneet Potti
Sandeep Tata
James Bradley Wendt
Marc Najork
Jing Xie
48
2
0
07 Jan 2022
Sign Language Video Retrieval with Free-Form Textual Queries
A. Duarte
Samuel Albanie
Xavier Giró-i-Nieto
Gül Varol
SLR
88
29
0
07 Jan 2022
Including STDP to eligibility propagation in multi-layer recurrent spiking neural networks
Werner van der Veen
77
1
0
05 Jan 2022
Class-Incremental Continual Learning into the eXtended DER-verse
Matteo Boschini
Lorenzo Bonicelli
Pietro Buzzega
Angelo Porrello
Simone Calderara
CLL, BDL
109
142
0
03 Jan 2022
PointCaps: Raw Point Cloud Processing using Capsule Networks with Euclidean Distance Routing
Dishanika Denipitiyage
Vinoj Jayasundara
Ranga Rodrigo
Chamira U. S. Edussooriya
3DPC
62
6
0
21 Dec 2021
Audio Retrieval with Natural Language Queries: A Benchmark Study
A. Sophia Koepke
Andreea-Maria Oncescu
João F. Henriques
Zeynep Akata
Samuel Albanie
78
102
0
17 Dec 2021
Improving Unsupervised Stain-To-Stain Translation using Self-Supervision and Meta-Learning
Nassim Bouteldja
B. Klinkhammer
Tarek Schlaich
P. Boor
Dorit Merhof
MedIm
48
21
0
16 Dec 2021
Self-Supervised Bot Play for Conversational Recommendation with Justifications
Shuyang Li
Bodhisattwa Prasad Majumder
Julian McAuley
86
7
0
09 Dec 2021
Information is Power: Intrinsic Control via Information Capture
Nick Rhinehart
Jenny Wang
Glen Berseth
John D. Co-Reyes
Danijar Hafner
Chelsea Finn
Sergey Levine
60
9
0
07 Dec 2021
In-flight Novelty Detection with Convolutional Neural Networks
A. Hartwell
Felipe J. Montana
William R. Jacobs
V. Kadirkamanathan
A. Mills
Tom S. Clark
48
6
0
07 Dec 2021
More layers! End-to-end regression and uncertainty on tabular data with deep learning
Ivan Bondarenko
OOD, LMTD, UQCV
55
4
0
07 Dec 2021
A Novel Convergence Analysis for Algorithms of the Adam Family
Zhishuai Guo
Yi Tian Xu
W. Yin
Rong Jin
Tianbao Yang
88
49
0
07 Dec 2021
JointLK: Joint Reasoning with Language Models and Knowledge Graphs for Commonsense Question Answering
Yueqing Sun
Qi Shi
Le Qi
Yu Zhang
RALM, LRM
89
72
0
06 Dec 2021
HyperInverter: Improving StyleGAN Inversion via Hypernetwork
Tan M. Dinh
Anh Tran
Rang Nguyen
Binh-Son Hua
81
119
0
01 Dec 2021
Environmental Sound Extraction Using Onomatopoeic Words
Yuki Okamoto
Shota Horiguchi
Masaaki Yamamoto
Keisuke Imoto
Yohei Kawaguchi
69
9
0
01 Dec 2021
Adaptive Optimization with Examplewise Gradients
Julius Kunze
James Townsend
David Barber
ODL
46
0
0
30 Nov 2021
DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation
Lukas Hoyer
Dengxin Dai
Luc Van Gool
AI4CE
107
462
0
29 Nov 2021
Buildings Classification using Very High Resolution Satellite Imagery
Mohammad Dimassi
A. Samhat
Mohammad Zaraket
Jamal Haydar
Mustafa Shukor
A. Ghandour
37
3
0
29 Nov 2021
Rethinking Generic Camera Models for Deep Single Image Camera Calibration to Recover Rotation and Fisheye Distortion
Nobuhiko Wakai
Satoshi Sato
Yasunori Ishii
Takayoshi Yamashita
74
8
0
25 Nov 2021
Rethinking the modeling of the instrumental response of telescopes with a differentiable optical model
T. Liaudat
Jean-Luc Starck
M. Kilbinger
P. Frugier
52
9
0
24 Nov 2021
Hidden-Fold Networks: Random Recurrent Residuals Using Sparse Supermasks
Ángel López García-Arias
Masanori Hashimoto
Masato Motomura
Jaehoon Yu
66
5
0
24 Nov 2021
Hierarchical Knowledge Distillation for Dialogue Sequence Labeling
Shota Orihashi
Yoshihiro Yamazaki
Naoki Makishima
Mana Ihori
Akihiko Takashima
Tomohiro Tanaka
Ryo Masumura
41
0
0
22 Nov 2021
Capitalization and Punctuation Restoration: a Survey
V. Pais
D. Tufis
80
19
0
21 Nov 2021
Diversified Multi-prototype Representation for Semi-supervised Segmentation
Jizong Peng
Christian Desrosiers
M. Pedersoli
57
1
0
16 Nov 2021
Deep Network Approximation in Terms of Intrinsic Parameters
Zuowei Shen
Haizhao Yang
Shijun Zhang
64
9
0
15 Nov 2021
CoreLM: Coreference-aware Language Model Fine-Tuning
Nikolaos Stylianou
I. Vlahavas
60
2
0
04 Nov 2021
Conformal prediction for text infilling and part-of-speech prediction
N. Dey
Jing Ding
Jack G. Ferrell
Carolina Kapper
Maxwell Lovig
Emiliano Planchon
Jonathan P. Williams
UQLM
140
21
0
04 Nov 2021
LogAvgExp Provides a Principled and Performant Global Pooling Operator
S. Lowe
Thomas Trappenberg
Sageev Oore
FAtt
53
2
0
02 Nov 2021
Large-Scale Deep Learning Optimizations: A Comprehensive Survey
Xiaoxin He
Fuzhao Xue
Xiaozhe Ren
Yang You
90
15
0
01 Nov 2021
Fully convolutional Siamese neural networks for buildings damage assessment from satellite images
Eugene Khvedchenya
Tatiana Gabruseva
43
9
0
31 Oct 2021
Whole Brain Segmentation with Full Volume Neural Network
Yeshu Li
Jianwei Cui
Yilun Sheng
Xiao Liang
Jingdong Wang
E. Chang
Yan Xu
146
11
0
29 Oct 2021
OSS-Net: Memory Efficient High Resolution Semantic Segmentation of 3D Medical Data
Christoph Reich
Tim Prangemeier
Ozdemir Cetin
Heinz Koeppl
68
12
0
20 Oct 2021
Training Deep Neural Networks with Adaptive Momentum Inspired by the Quadratic Optimization
Tao Sun
Huaming Ling
Zuoqiang Shi
Dongsheng Li
Bao Wang
ODL
65
13
0
18 Oct 2021
Hierarchical Curriculum Learning for AMR Parsing
Peiyi Wang
Liang Chen
Tianyu Liu
Damai Dai
Yunbo Cao
Baobao Chang
Zhifang Sui
113
15
0
15 Oct 2021
Dynamic Inference with Neural Interpreters
Nasim Rahaman
Muhammad Waleed Gondal
S. Joshi
Peter V. Gehler
Yoshua Bengio
Francesco Locatello
Bernhard Schölkopf
105
31
0
12 Oct 2021
LightSeq2: Accelerated Training for Transformer-based Models on GPUs
Xiaohui Wang
Yang Wei
Ying Xiong
Guyue Huang
Xian Qian
Yufei Ding
Mingxuan Wang
Lei Li
VLM
62
33
0
12 Oct 2021
Momentum Centering and Asynchronous Update for Adaptive Gradient Methods
Juntang Zhuang
Yifan Ding
Tommy M. Tang
Nicha Dvornek
S. Tatikonda
James S. Duncan
ODL
55
4
0
11 Oct 2021
Vision Transformer based COVID-19 Detection using Chest X-rays
Koushik Sivarama Krishnan
Karthik Sivarama Krishnan
ViT, MedIm
68
57
0
09 Oct 2021
Taming Sparsely Activated Transformer with Stochastic Experts
Simiao Zuo
Xiaodong Liu
Jian Jiao
Young Jin Kim
Hany Hassan
Ruofei Zhang
T. Zhao
Jianfeng Gao
MoE
123
115
0
08 Oct 2021