On the Variance of the Adaptive Learning Rate and Beyond

8 August 2019

Xiaodong Liu

Papers citing "On the Variance of the Adaptive Learning Rate and Beyond"

Motion Puzzle: Arbitrary Motion Style Transfer by Body Part Deok-Kyeong Jang S. Park Sung-Hee Lee 3DH 70 60 0 10 Feb 2022
Particle Transformer for Jet Tagging H. Qu Congqiao Li Sitian Qian ViT MedIm 85 106 0 08 Feb 2022
No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models Chen Liang Haoming Jiang Simiao Zuo Pengcheng He Xiaodong Liu Jianfeng Gao Weizhu Chen T. Zhao 72 14 0 06 Feb 2022
Boundary-aware Information Maximization for Self-supervised Medical Image Segmentation Jizong Peng Ping Wang M. Pedersoli Christian Desrosiers SSL 77 6 0 04 Feb 2022
Global Optimization Networks Sen Zhao Erez Louidor Ilan Oleksandr Mangylov Maya R. Gupta 111 6 0 02 Feb 2022
On the Power-Law Hessian Spectrums in Deep Learning Zeke Xie Qian-Yuan Tang Yunfeng Cai Mingming Sun P. Li ODL 99 10 0 31 Jan 2022
A Stochastic Bundle Method for Interpolating Networks Alasdair Paren Leonard Berrada Rudra P. K. Poudel M. P. Kumar 76 4 0 29 Jan 2022
Benchmarking Conventional Vision Models on Neuromorphic Fall Detection and Action Recognition Dataset Karthik Sivarama Krishnan Koushik Sivarama Krishnan 48 5 0 28 Jan 2022
Learning to Minimize the Remainder in Supervised Learning Yan Luo Yongkang Wong Mohan S. Kankanhalli Qi Zhao 97 1 0 23 Jan 2022
AdaTerm: Adaptive T-Distribution Estimated Robust Moments for Noise-Robust Stochastic Gradient Optimization Wendyam Eric Lionel Ilboudo Taisuke Kobayashi Takamitsu Matsubara 89 13 0 18 Jan 2022
Generalization in Supervised Learning Through Riemannian Contraction L. Kozachkov Patrick M. Wensing Jean-Jacques E. Slotine MLT 91 9 0 17 Jan 2022
Data-Efficient Information Extraction from Form-Like Documents Beliz Gunel Navneet Potti Sandeep Tata James Bradley Wendt Marc Najork Jing Xie 48 2 0 07 Jan 2022
Sign Language Video Retrieval with Free-Form Textual Queries A. Duarte Samuel Albanie Xavier Giró-i-Nieto Gül Varol SLR 88 29 0 07 Jan 2022
Including STDP to eligibility propagation in multi-layer recurrent spiking neural networks Werner van der Veen 77 1 0 05 Jan 2022
Class-Incremental Continual Learning into the eXtended DER-verse Matteo Boschini Lorenzo Bonicelli Pietro Buzzega Angelo Porrello Simone Calderara CLL BDL 109 142 0 03 Jan 2022
PointCaps: Raw Point Cloud Processing using Capsule Networks with Euclidean Distance Routing Dishanika Denipitiyage Vinoj Jayasundara Ranga Rodrigo Chamira U. S. Edussooriya 3DPC 62 6 0 21 Dec 2021
Audio Retrieval with Natural Language Queries: A Benchmark Study A. Sophia Koepke Andreea-Maria Oncescu João F. Henriques Zeynep Akata Samuel Albanie 78 102 0 17 Dec 2021
Improving Unsupervised Stain-To-Stain Translation using Self-Supervision and Meta-Learning Nassim Bouteldja B. Klinkhammer Tarek Schlaich P. Boor Dorit Merhof MedIm 48 21 0 16 Dec 2021
Self-Supervised Bot Play for Conversational Recommendation with Justifications Shuyang Li Bodhisattwa Prasad Majumder Julian McAuley 86 7 0 09 Dec 2021
Information is Power: Intrinsic Control via Information Capture Nick Rhinehart Jenny Wang Glen Berseth John D. Co-Reyes Danijar Hafner Chelsea Finn Sergey Levine 60 9 0 07 Dec 2021
In-flight Novelty Detection with Convolutional Neural Networks A. Hartwell Felipe J. Montana William R. Jacobs V. Kadirkamanathan A. Mills Tom S. Clark 48 6 0 07 Dec 2021
More layers! End-to-end regression and uncertainty on tabular data with deep learning Ivan Bondarenko OOD LMTD UQCV 55 4 0 07 Dec 2021
A Novel Convergence Analysis for Algorithms of the Adam Family Zhishuai Guo Yi Tian Xu W. Yin Rong Jin Tianbao Yang 88 49 0 07 Dec 2021
JointLK: Joint Reasoning with Language Models and Knowledge Graphs for Commonsense Question Answering Yueqing Sun Qi Shi Le Qi Yu Zhang RALM LRM 89 72 0 06 Dec 2021
HyperInverter: Improving StyleGAN Inversion via Hypernetwork Tan M. Dinh Anh Tran Rang Nguyen Binh-Son Hua 81 119 0 01 Dec 2021
Environmental Sound Extraction Using Onomatopoeic Words Yuki Okamoto Shota Horiguchi Masaaki Yamamoto Keisuke Imoto Yohei Kawaguchi 69 9 0 01 Dec 2021
Adaptive Optimization with Examplewise Gradients Julius Kunze James Townsend David Barber ODL 46 0 0 30 Nov 2021
DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation Lukas Hoyer Dengxin Dai Luc Van Gool AI4CE 107 462 0 29 Nov 2021
Buildings Classification using Very High Resolution Satellite Imagery Mohammad Dimassi A. Samhat Mohammad Zaraket Jamal Haydar Mustafa Shukor A. Ghandour 37 3 0 29 Nov 2021
Rethinking Generic Camera Models for Deep Single Image Camera Calibration to Recover Rotation and Fisheye Distortion Nobuhiko Wakai Satoshi Sato Yasunori Ishii Takayoshi Yamashita 74 8 0 25 Nov 2021
Rethinking the modeling of the instrumental response of telescopes with a differentiable optical model T. Liaudat Jean-Luc Starck M. Kilbinger P. Frugier 52 9 0 24 Nov 2021
Hidden-Fold Networks: Random Recurrent Residuals Using Sparse Supermasks Ángel López García-Arias Masanori Hashimoto Masato Motomura Jaehoon Yu 66 5 0 24 Nov 2021
Hierarchical Knowledge Distillation for Dialogue Sequence Labeling Shota Orihashi Yoshihiro Yamazaki Naoki Makishima Mana Ihori Akihiko Takashima Tomohiro Tanaka Ryo Masumura 41 0 0 22 Nov 2021
Capitalization and Punctuation Restoration: a Survey V. Pais D. Tufis 80 19 0 21 Nov 2021
Diversified Multi-prototype Representation for Semi-supervised Segmentation Jizong Peng Christian Desrosiers M. Pedersoli 57 1 0 16 Nov 2021
Deep Network Approximation in Terms of Intrinsic Parameters Zuowei Shen Haizhao Yang Shijun Zhang 64 9 0 15 Nov 2021
CoreLM: Coreference-aware Language Model Fine-Tuning Nikolaos Stylianou I. Vlahavas 60 2 0 04 Nov 2021
Conformal prediction for text infilling and part-of-speech prediction N. Dey Jing Ding Jack G. Ferrell Carolina Kapper Maxwell Lovig Emiliano Planchon Jonathan P. Williams UQLM 140 21 0 04 Nov 2021
LogAvgExp Provides a Principled and Performant Global Pooling Operator S. Lowe Thomas Trappenberg Sageev Oore FAtt 53 2 0 02 Nov 2021
Large-Scale Deep Learning Optimizations: A Comprehensive Survey Xiaoxin He Fuzhao Xue Xiaozhe Ren Yang You 90 15 0 01 Nov 2021
Fully convolutional Siamese neural networks for buildings damage assessment from satellite images Eugene Khvedchenya Tatiana Gabruseva 43 9 0 31 Oct 2021
Whole Brain Segmentation with Full Volume Neural Network Yeshu Li Jianwei Cui Yilun Sheng Xiao Liang Jingdong Wang E. Chang Yan Xu 146 11 0 29 Oct 2021
OSS-Net: Memory Efficient High Resolution Semantic Segmentation of 3D Medical Data Christoph Reich Tim Prangemeier Ozdemir Cetin Heinz Koeppl 68 12 0 20 Oct 2021
Training Deep Neural Networks with Adaptive Momentum Inspired by the Quadratic Optimization Tao Sun Huaming Ling Zuoqiang Shi Dongsheng Li Bao Wang ODL 65 13 0 18 Oct 2021
Hierarchical Curriculum Learning for AMR Parsing Peiyi Wang Liang Chen Tianyu Liu Damai Dai Yunbo Cao Baobao Chang Zhifang Sui 113 15 0 15 Oct 2021
Dynamic Inference with Neural Interpreters Nasim Rahaman Muhammad Waleed Gondal S. Joshi Peter V. Gehler Yoshua Bengio Francesco Locatello Bernhard Schölkopf 105 31 0 12 Oct 2021
LightSeq2: Accelerated Training for Transformer-based Models on GPUs Xiaohui Wang Yang Wei Ying Xiong Guyue Huang Xian Qian Yufei Ding Mingxuan Wang Lei Li VLM 62 33 0 12 Oct 2021
Momentum Centering and Asynchronous Update for Adaptive Gradient Methods Juntang Zhuang Yifan Ding Tommy M. Tang Nicha Dvornek S. Tatikonda James S. Duncan ODL 55 4 0 11 Oct 2021
Vision Transformer based COVID-19 Detection using Chest X-rays Koushik Sivarama Krishnan Karthik Sivarama Krishnan ViT MedIm 68 57 0 09 Oct 2021
Taming Sparsely Activated Transformer with Stochastic Experts Simiao Zuo Xiaodong Liu Jian Jiao Young Jin Kim Hany Hassan Ruofei Zhang T. Zhao Jianfeng Gao MoE 123 115 0 08 Oct 2021