On the Variance of the Adaptive Learning Rate and Beyond

8 August 2019

Xiaodong Liu

Papers citing "On the Variance of the Adaptive Learning Rate and Beyond"

50 / 864 papers shown

Title
Extrapolatable Relational Reasoning With Comparators in Low-Dimensional Manifolds Duo Wang M. Jamnik Pietro Lio OOD 29 1 0 15 Jun 2020
The Limit of the Batch Size Yang You Yuhui Wang Huan Zhang Zhao-jie Zhang J. Demmel Cho-Jui Hsieh 121 15 0 15 Jun 2020
AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights Byeongho Heo Sanghyuk Chun Seong Joon Oh Dongyoon Han Sangdoo Yun Gyuwan Kim Youngjung Uh Jung-Woo Ha ODL 375 27 0 15 Jun 2020
BI-MAML: Balanced Incremental Approach for Meta Learning Yang Zheng Jinlin Xiang Kun Su Eli Shlizerman CLL 48 10 0 12 Jun 2020
Getting a CLUE: A Method for Explaining Uncertainty Estimates Javier Antorán Umang Bhatt T. Adel Adrian Weller José Miguel Hernández-Lobato UQCV BDL 110 116 0 11 Jun 2020
Adaptive Gradient Methods Converge Faster with Over-Parameterization (but you should do a line-search) Sharan Vaswani I. Laradji Frederik Kunstner S. Meng Mark Schmidt Simon Lacoste-Julien 142 27 0 11 Jun 2020
AdaS: Adaptive Scheduling of Stochastic Gradients Mahdi S. Hosseini Konstantinos N. Plataniotis ODL 80 12 0 11 Jun 2020
Deep learning of contagion dynamics on complex networks Charles Murphy Edward Laurence Antoine Allard GNN AI4CE 22 70 0 09 Jun 2020
Detecting structural perturbations from time series with deep learning Edward Laurence Charles Murphy G. St‐Onge Xavier Roy-Pomerleau Vincent Thibeault 37 3 0 09 Jun 2020
On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines Marius Mosbach Maksym Andriushchenko Dietrich Klakow 187 363 0 08 Jun 2020
DeBERTa: Decoding-enhanced BERT with Disentangled Attention Pengcheng He Xiaodong Liu Jianfeng Gao Weizhu Chen AAML 189 2,770 0 05 Jun 2020
sEMG Gesture Recognition with a Simple Model of Attention David Josephs Carson Drake Andrew Heroy John Santerre 64 48 0 05 Jun 2020
Novel Object Viewpoint Estimation through Reconstruction Alignment Mohamed El Banani Jason J. Corso David Fouhey 122 14 0 05 Jun 2020
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning Z. Yao A. Gholami Sheng Shen Mustafa Mustafa Kurt Keutzer Michael W. Mahoney ODL 152 287 0 01 Jun 2020
Universal Lesion Detection by Learning from Multiple Heterogeneously Labeled Datasets K. Yan Jinzheng Cai Adam P. Harrison D. Jin Jing Xiao Le Lu 63 18 0 28 May 2020
Detecting Scatteredly-Distributed, Small, and Critically Important Objects in 3D Oncology Imaging via Decision Stratification Zhuotun Zhu K. Yan D. Jin Jinzheng Cai T. Ho ... Chun-Hung Chao X. Ye Jing Xiao Alan Yuille Le Lu 63 10 0 27 May 2020
Few-shot Compositional Font Generation with Dual Memory Junbum Cha Sanghyuk Chun Gayoung Lee Bado Lee Seonghyeon Kim Hwalsuk Lee 70 80 0 21 May 2020
ASAPP-ASR: Multistream CNN and Self-Attentive SRU for SOTA Speech Recognition Jing Pan Joshua Shapiro Jeremy Wohlwend Kyu Jeong Han Tao Lei T. Ma 72 22 0 21 May 2020
Quasi-Periodic Parallel WaveGAN Vocoder: A Non-autoregressive Pitch-dependent Dilated Convolution Model for Parametric Speech Generation Yi-Chiao Wu Tomoki Hayashi T. Okamoto Hisashi Kawai Tomoki Toda 73 4 0 18 May 2020
JDI-T: Jointly trained Duration Informed Transformer for Text-To-Speech without Explicit Alignment D. Lim Won Jang Gyeonghwan O Heayoung Park Bongwan Kim Jaesam Yoon 68 37 0 15 May 2020
Neural Networks Versus Conventional Filters for Inertial-Sensor-based Attitude Estimation Daniel Weber C. Gühmann Thomas Seel 37 35 0 14 May 2020
DiscreTalk: Text-to-Speech as a Machine Translation Problem Tomoki Hayashi Shinji Watanabe 70 32 0 12 May 2020
Benchmark Tests of Convolutional Neural Network and Graph Convolutional Network on HorovodRunner Enabled Spark Clusters Jing Pan Wendao Liu Jing Zhou GNN BDL 14 2 0 12 May 2020
2kenize: Tying Subword Sequences for Chinese Script Conversion Pranav A Isabelle Augenstein 66 1 0 07 May 2020
Scalable Multi-Hop Relational Reasoning for Knowledge-Aware Question Answering Yanlin Feng Xinyue Chen Bill Yuchen Lin Peifeng Wang Jun Yan Xiang Ren LRM KELM 79 246 0 01 May 2020
BlackBox: Generalizable Reconstruction of Extremal Values from Incomplete Spatio-Temporal Data T. Ivek Domagoj Vlah 66 4 0 30 Apr 2020
How to Learn a Useful Critic? Model-based Action-Gradient-Estimator Policy Optimization P. D'Oro Wojciech Jaśkowski OffRL 92 27 0 29 Apr 2020
Learning Neural-Symbolic Descriptive Planning Models via Cube-Space Priors: The Voyage Home (to STRIPS) Masataro Asai Christian Muise 64 8 0 27 Apr 2020
Five Points to Check when Comparing Visual Perception in Humans and Machines Christina M. Funke Judy Borowski Karolina Stosio Wieland Brendel Thomas S. A. Wallis Matthias Bethge 71 33 0 20 Apr 2020
Adversarial Training for Large Neural Language Models Xiaodong Liu Hao Cheng Pengcheng He Weizhu Chen Yu Wang Hoifung Poon Jianfeng Gao AAML 94 186 0 20 Apr 2020
Organ at Risk Segmentation for Head and Neck Cancer using Stratified Learning and Neural Architecture Search Dazhou Guo D. Jin Zhuotun Zhu T. Ho Adam P. Harrison Chun-Hung Chao Jing Xiao Alan Yuille Chien-Yu Lin Le Lu 85 59 0 17 Apr 2020
Understanding the Difficulty of Training Transformers Liyuan Liu Xiaodong Liu Jianfeng Gao Weizhu Chen Jiawei Han AI4CE 85 258 0 17 Apr 2020
A Cross-Stitch Architecture for Joint Registration and Segmentation in Adaptive Radiotherapy Laurens Beljaards Mohamed S. Elmahdy F. Verbeek Marius Staring 3DV 60 15 0 17 Apr 2020
Show Us the Way: Learning to Manage Dialog from Demonstrations Gabriel Gordon-Hall P. Gorinski Gerasimos Lampouras Ignacio Iacobacci OffRL 109 11 0 17 Apr 2020
An Adaptive Intelligence Algorithm for Undersampled Knee MRI Reconstruction Nicola Pezzotti Sahar Yousefi Mohamed S. Elmahdy J. V. Gemert C. Schulke ... Sergey Kastryulin B. Lelieveldt M. Osch E. Weerdt Marius Staring 70 100 0 15 Apr 2020
Self6D: Self-Supervised Monocular 6D Object Pose Estimation Gu Wang Fabian Manhardt Jianzhun Shao Xiangyang Ji Nassir Navab Federico Tombari SSL MDE 100 136 0 14 Apr 2020
Interview: A Large-Scale Open-Source Corpus of Media Dialog Bodhisattwa Prasad Majumder Shuyang Li Jianmo Ni Julian McAuley AuLLM 58 4 0 07 Apr 2020
Residual Shuffle-Exchange Networks for Fast Processing of Long Sequences Andis Draguns Emīls Ozoliņš A. Sostaks Matiss Apinis Kārlis Freivalds 56 8 0 06 Apr 2020
Applying Cyclical Learning Rate to Neural Machine Translation Choon Meng Lee Jianfeng Liu Wei Peng ODL 27 2 0 06 Apr 2020
Towards Lifelong Self-Supervision For Unpaired Image-to-Image Translation Victor Schmidt Makesh Narsimhan Sreedhar M. Elaraby Irina Rish SSL CLL 45 2 0 31 Mar 2020
Nonconvex sparse regularization for deep neural networks and its optimality Ilsang Ohn Yongdai Kim 71 19 0 26 Mar 2020
iTAML: An Incremental Task-Agnostic Meta-learning Approach Jathushan Rajasegaran Salman Khan Munawar Hayat Fahad Shahbaz Khan M. Shah CLL OOD 146 157 0 25 Mar 2020
Multi-Plateau Ensemble for Endoscopic Artefact Segmentation and Detection Suyog Jadhav Udbhav Bamba Arnav Chavan Rishabh Tiwari A. Raj 38 3 0 23 Mar 2020
End-to-End Deep Diagnosis of X-ray Images Kudaibergen Urinbayev Yerassyl Orazbek Yernur Nurambek A. Mirzakhmetov H. A. Varol MedIm 30 12 0 19 Mar 2020
Getting to 99% Accuracy in Interactive Segmentation Marco Forte Brian L. Price Scott D. Cohen N. Xu François Pitié 155 39 0 17 Mar 2020
Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation Huiyu Wang Yukun Zhu Bradley Green Hartwig Adam Alan Yuille Liang-Chieh Chen 3DPC 136 674 0 17 Mar 2020
$F$, $B$, Alpha Matting Marco Forte François Pitié 106 85 0 17 Mar 2020
PLOP: Probabilistic poLynomial Objects trajectory Planning for autonomous driving Thibault Buhet É. Wirbel Andrei Bursuc Xavier Perrotton 76 33 0 09 Mar 2020
Wide-minima Density Hypothesis and the Explore-Exploit Learning Rate Schedule Nikhil Iyer V. Thejas Nipun Kwatra Ramachandran Ramjee Muthian Sivathanu 89 29 0 09 Mar 2020
Colored Noise Injection for Training Adversarially Robust Neural Networks Evgenii Zheltonozhskii Chaim Baskin Yaniv Nemcovsky Brian Chmiel A. Mendelson A. Bronstein AAML 32 5 0 04 Mar 2020