On the Variance of the Adaptive Learning Rate and Beyond

8 August 2019
Liyuan Liu
Haoming Jiang
Pengcheng He
Weizhu Chen
Xiaodong Liu
Jianfeng Gao
Jiawei Han
    ODL
ArXiv (abs) | PDF | HTML | GitHub (2548★)

Papers citing "On the Variance of the Adaptive Learning Rate and Beyond"

50 / 864 papers shown
SING: A Plug-and-Play DNN Learning Technique
Adrien Courtois
Damien Scieur
Jean-Michel Morel
Pablo Arias
Thomas Eboli
68
0
0
25 May 2023
Two Sides of One Coin: the Limits of Untuned SGD and the Power of Adaptive Methods
Junchi Yang
Xiang Li
Ilyas Fatkhullin
Niao He
92
17
0
21 May 2023
Human-annotated label noise and their impact on ConvNets for remote sensing image scene classification
Long Peng
T. Wei
Xuehong Chen
Xiaobei Chen
Rui Sun
L. Wan
Jin Chen
Xiaolin Zhu
NoLa
42
3
0
20 May 2023
$\partial\mathbb{B}$ nets: learning discrete functions by gradient descent
Ian Wright
126
0
0
12 May 2023
Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks
Yun Tang
Anna Y. Sun
Hirofumi Inaguma
Xinyue Chen
Ning Dong
Xutai Ma
Paden Tomasello
J. Pino
108
22
0
04 May 2023
BranchNorm: Robustly Scaling Extremely Deep Transformers
Yanjun Liu
Xianfeng Zeng
Fandong Meng
Jie Zhou
77
3
0
04 May 2023
Semi-Supervised Segmentation of Functional Tissue Units at the Cellular Level
V. Sydorskyi
Igor Krashenyi
Denis Savka
Oleksandr Zarichkovyi
41
1
0
03 May 2023
Environmental sound synthesis from vocal imitations and sound event labels
Yuki Okamoto
Keisuke Imoto
Shinnosuke Takamichi
Ryotaro Nagase
Takahiro Fukumori
Y. Yamashita
42
0
0
29 Apr 2023
Domain Adaptive and Generalizable Network Architectures and Training Strategies for Semantic Image Segmentation
Lukas Hoyer
Dengxin Dai
Luc Van Gool
AI4CE OOD
131
26
0
26 Apr 2023
Learning imaging mechanism directly from optical microscopy observations
Ze-Hao Wang
Long-Kun Shan
Tong-Tian Weng
Tianrun Chen
Qiyuan Wang
Xiang-Dong Chen
Zhang Wang
Guanghsheng Guo
DiffM
28
1
0
25 Apr 2023
Universal Adversarial Backdoor Attacks to Fool Vertical Federated Learning in Cloud-Edge Collaboration
Peng Chen
Xin Du
Zhihui Lu
Hongfeng Chai
FedML AAML
95
11
0
22 Apr 2023
Angle based dynamic learning rate for gradient descent
Neel Mishra
Kiran Ravish
ODL
69
1
0
20 Apr 2023
Tetra-NeRF: Representing Neural Radiance Fields Using Tetrahedra
Jonáš Kulhánek
Torsten Sattler
109
51
0
19 Apr 2023
Bridging Discrete and Backpropagation: Straight-Through and Beyond
Liyuan Liu
Chengyu Dong
Xiaodong Liu
Bin Yu
Jianfeng Gao
BDL
89
23
0
17 Apr 2023
Modeling Dense Multimodal Interactions Between Biological Pathways and Histology for Survival Prediction
Guillaume Jaume
Anurag J. Vaidya
Richard J. Chen
Drew F. K. Williamson
Paul Pu Liang
Faisal Mahmood
91
51
0
13 Apr 2023
CrowdCLIP: Unsupervised Crowd Counting via Vision-Language Model
Dingkang Liang
Jiahao Xie
Zhikang Zou
Xiaoqing Ye
Wei Xu
Xiang Bai
SSL CLIP VLM
109
57
0
09 Apr 2023
METransformer: Radiology Report Generation by Transformer with Multiple Learnable Expert Tokens
Zhanyu Wang
Lingqiao Liu
Lei Wang
Luping Zhou
MedIm
77
76
0
05 Apr 2023
Revisiting Context Aggregation for Image Matting
Qinglin Liu
Xiaoqian Lv
Quanling Meng
Zonglin Li
Xiangyuan Lan
Shuo Yang
Shengping Zhang
Liqiang Nie
88
5
0
03 Apr 2023
Astroformer: More Data Might not be all you need for Classification
Rishit Dagli
107
8
0
03 Apr 2023
Devil is in the Queries: Advancing Mask Transformers for Real-world Medical Image Segmentation and Out-of-Distribution Localization
Mingze Yuan
Yingda Xia
Hexin Dong
Zi Chen
Jiawen Yao
...
Bin Dong
Jing Zhou
Le Lu
Ling Zhang
Li Zhang
OOD MedIm
57
23
0
01 Apr 2023
Deep Single Image Camera Calibration by Heatmap Regression to Recover Fisheye Images Under Manhattan World Assumption
Nobuhiko Wakai
Satoshi Sato
Yasunori Ishii
Takayoshi Yamashita
85
5
0
30 Mar 2023
Exploring Deep Learning Methods for Classification of SAR Images: Towards NextGen Convolutions via Transformers
Ashutosh Kumar Singh
Vivek Kumar Singh
26
0
0
28 Mar 2023
HS-Pose: Hybrid Scope Feature Extraction for Category-level Object Pose Estimation
Linfang Zheng
Chen Wang
Ying Sun
Esha Dasgupta
Hua Chen
A. Leonardis
Wei Zhang
H. Chang
3DPC
97
44
0
28 Mar 2023
AgileGAN3D: Few-Shot 3D Portrait Stylization by Augmented Transfer Learning
Guoxian Song
Hongyi Xu
Jing Liu
Tiancheng Zhi
Yichun Shi
Jianfeng Zhang
Zihang Jiang
Jiashi Feng
S. Sang
Linjie Luo
3DH
55
6
0
24 Mar 2023
Toward Open-domain Slot Filling via Self-supervised Co-training
A. Mosharrof
Moghis Fereidouni
A.B. Siddique
50
1
0
24 Mar 2023
TriPlaneNet: An Encoder for EG3D Inversion
A. Bhattarai
Matthias Nießner
Artem Sevastopolsky
85
35
0
23 Mar 2023
A Survey of Historical Learning: Learning Models with Learning History
Xiang Li
Ge Wu
Lingfeng Yang
Wenzhe Wang
Renjie Song
Jian Yang
MU AI4TS
103
2
0
23 Mar 2023
Unsupervised Domain Adaptation for Training Event-Based Networks Using Contrastive Learning and Uncorrelated Conditioning
Dayuan Jian
Mohammad Rostami
83
14
0
22 Mar 2023
Make Encoder Great Again in 3D GAN Inversion through Geometry and Occlusion-Aware Encoding
Ziyang Yuan
Yiming Zhu
Yu Li
Hongyu Liu
Chun Yuan
3DV
64
37
0
22 Mar 2023
Picture that Sketch: Photorealistic Image Generation from Abstract Sketches
Subhadeep Koley
A. Bhunia
Aneeshan Sain
Pinaki Nath Chowdhury
Tao Xiang
Yi-Zhe Song
3DH
109
35
0
20 Mar 2023
Transformer Models for Type Inference in the Simply Typed Lambda Calculus: A Case Study in Deep Learning for Code
Brando Miranda
Avraham Shinnar
V. Pestun
B. Trager
37
3
0
15 Mar 2023
SPOTR: Spatio-temporal Pose Transformers for Human Motion Prediction
A. A. Nargund
Misha Sra
ViT
90
2
0
11 Mar 2023
EfficientTempNet: Temporal Super-Resolution of Radar Rainfall
B. Demiray
M. Sit
Ibrahim Demir
59
4
0
09 Mar 2023
InfoBatch: Lossless Training Speed Up by Unbiased Dynamic Data Pruning
Ziheng Qin
Kaidi Wang
Zangwei Zheng
Jianyang Gu
Xiang Peng
...
Daquan Zhou
Lei Shang
Baigui Sun
Xuansong Xie
Yang You
187
53
0
08 Mar 2023
Diffusing Gaussian Mixtures for Generating Categorical Data
Florence Regol
Mark Coates
DiffM
85
5
0
08 Mar 2023
Judging Adam: Studying the Performance of Optimization Methods on ML4SE Tasks
D. Pasechnyuk
Anton Prazdnichnykh
Mikhail Evtikhiev
T. Bryksin
67
1
0
06 Mar 2023
FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model
Rui Xue
Yanqing Liu
Lei He
Xuejiao Tan
Linquan Liu
Ed Lin
Sheng Zhao
118
7
0
06 Mar 2023
Fixed-point quantization aware training for on-device keyword-spotting
Sashank Macha
Om Oza
Alex Escott
Francesco Calivá
Robert M. Armitano
S. Cheekatmalla
S. Parthasarathi
Yuzong Liu
MQ
42
4
0
04 Mar 2023
Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization
Xingxuan Zhang
Renzhe Xu
Han Yu
Hao Zou
Peng Cui
77
41
0
03 Mar 2023
Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers
Tianlong Chen
Zhenyu Zhang
Ajay Jaiswal
Shiwei Liu
Zhangyang Wang
MoE
116
50
0
02 Mar 2023
Consistency Models
Yang Song
Prafulla Dhariwal
Mark Chen
Ilya Sutskever
VLM DiffM
119
982
0
02 Mar 2023
BEL: A Bag Embedding Loss for Transformer enhances Multiple Instance Whole Slide Image Classification
Daniel Sens
Ario Sadafi
F. P. Casale
Nassir Navab
Carsten Marr
ViT MedIm
39
1
0
02 Mar 2023
I2V: Towards Texture-Aware Self-Supervised Blind Denoising using Self-Residual Learning for Real-World Images
Kanggeun Lee
K. Lee
Won-Ki Jeong
69
0
0
21 Feb 2023
One-Shot Face Video Re-enactment using Hybrid Latent Spaces of StyleGAN2
Trevine Oorloff
Yaser Yacoob
CVBM
53
3
0
15 Feb 2023
The Role of Semantic Parsing in Understanding Procedural Text
Hossein Rajaby Faghihi
Parisa Kordjamshidi
C. Teng
J. Allen
67
5
0
14 Feb 2023
Symbolic Discovery of Optimization Algorithms
Xiangning Chen
Chen Liang
Da Huang
Esteban Real
Kaiyuan Wang
...
Xuanyi Dong
Thang Luong
Cho-Jui Hsieh
Yifeng Lu
Quoc V. Le
176
381
0
13 Feb 2023
FedDA: Faster Framework of Local Adaptive Gradient Methods via Restarted Dual Averaging
Junyi Li
Feihu Huang
Heng-Chiao Huang
FedML
74
1
0
13 Feb 2023
Multi-scale Feature Alignment for Continual Learning of Unlabeled Domains
Kevin Thandiackal
Luigi Piccinelli
Pushpak Pati
O. Goksel
CLL OOD MedIm
79
7
0
02 Feb 2023
On Suppressing Range of Adaptive Stepsizes of Adam to Improve Generalisation Performance
Guoqiang Zhang
ODL
52
4
0
02 Feb 2023
A Survey of Deep Learning: From Activations to Transformers
Johannes Schneider
Michalis Vlachos
ViT MedIm AI4TS AI4CE
112
10
0
01 Feb 2023