On the Variance of the Adaptive Learning Rate and Beyond

8 August 2019

Xiaodong Liu

Papers citing "On the Variance of the Adaptive Learning Rate and Beyond"

50 / 864 papers shown

Title
SING: A Plug-and-Play DNN Learning Technique Adrien Courtois Damien Scieur Jean-Michel Morel Pablo Arias Thomas Eboli 68 0 0 25 May 2023
Two Sides of One Coin: the Limits of Untuned SGD and the Power of Adaptive Methods Junchi Yang Xiang Li Ilyas Fatkhullin Niao He 92 17 0 21 May 2023
Human-annotated label noise and their impact on ConvNets for remote sensing image scene classification Long Peng T. Wei Xuehong Chen Xiaobei Chen Rui Sun L. Wan Jin Chen Xiaolin Zhu NoLa 42 3 0 20 May 2023
$\partial\mathbb{B}$ nets: learning discrete functions by gradient descent Ian Wright 126 0 0 12 May 2023
Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks Yun Tang Anna Y. Sun Hirofumi Inaguma Xinyue Chen Ning Dong Xutai Ma Paden Tomasello J. Pino 108 22 0 04 May 2023
BranchNorm: Robustly Scaling Extremely Deep Transformers Yanjun Liu Xianfeng Zeng Fandong Meng Jie Zhou 77 3 0 04 May 2023
Semi-Supervised Segmentation of Functional Tissue Units at the Cellular Level V. Sydorskyi Igor Krashenyi Denis Savka Oleksandr Zarichkovyi 41 1 0 03 May 2023
Environmental sound synthesis from vocal imitations and sound event labels Yuki Okamoto Keisuke Imoto Shinnosuke Takamichi Ryotaro Nagase Takahiro Fukumori Y. Yamashita 42 0 0 29 Apr 2023
Domain Adaptive and Generalizable Network Architectures and Training Strategies for Semantic Image Segmentation Lukas Hoyer Dengxin Dai Luc Van Gool AI4CE OOD 131 26 0 26 Apr 2023
Learning imaging mechanism directly from optical microscopy observations Ze-Hao Wang Long-Kun Shan Tong-Tian Weng Tianrun Chen Qiyuan Wang Xiang-Dong Chen Zhang Wang Guanghsheng Guo DiffM 28 1 0 25 Apr 2023
Universal Adversarial Backdoor Attacks to Fool Vertical Federated Learning in Cloud-Edge Collaboration Peng Chen Xin Du Zhihui Lu Hongfeng Chai FedML AAML 95 11 0 22 Apr 2023
Angle based dynamic learning rate for gradient descent Neel Mishra Kiran Ravish ODL 69 1 0 20 Apr 2023
Tetra-NeRF: Representing Neural Radiance Fields Using Tetrahedra Jonáš Kulhánek Torsten Sattler 109 51 0 19 Apr 2023
Bridging Discrete and Backpropagation: Straight-Through and Beyond Liyuan Liu Chengyu Dong Xiaodong Liu Bin Yu Jianfeng Gao BDL 89 23 0 17 Apr 2023
Modeling Dense Multimodal Interactions Between Biological Pathways and Histology for Survival Prediction Guillaume Jaume Anurag J. Vaidya Richard J. Chen Drew F. K. Williamson Paul Pu Liang Faisal Mahmood 91 51 0 13 Apr 2023
CrowdCLIP: Unsupervised Crowd Counting via Vision-Language Model Dingkang Liang Jiahao Xie Zhikang Zou Xiaoqing Ye Wei Xu Xiang Bai SSL CLIP VLM 109 57 0 09 Apr 2023
METransformer: Radiology Report Generation by Transformer with Multiple Learnable Expert Tokens Zhanyu Wang Lingqiao Liu Lei Wang Luping Zhou MedIm 77 76 0 05 Apr 2023
Revisiting Context Aggregation for Image Matting Qinglin Liu Xiaoqian Lv Quanling Meng Zonglin Li Xiangyuan Lan Shuo Yang Shengping Zhang Liqiang Nie 88 5 0 03 Apr 2023
Astroformer: More Data Might not be all you need for Classification Rishit Dagli 107 8 0 03 Apr 2023
Devil is in the Queries: Advancing Mask Transformers for Real-world Medical Image Segmentation and Out-of-Distribution Localization Mingze Yuan Yingda Xia Hexin Dong Zi Chen Jiawen Yao ... Bin Dong Jing Zhou Le Lu Ling Zhang Li Zhang OOD MedIm 57 23 0 01 Apr 2023
Deep Single Image Camera Calibration by Heatmap Regression to Recover Fisheye Images Under Manhattan World Assumption Nobuhiko Wakai Satoshi Sato Yasunori Ishii Takayoshi Yamashita 85 5 0 30 Mar 2023
Exploring Deep Learning Methods for Classification of SAR Images: Towards NextGen Convolutions via Transformers Ashutosh Kumar Singh Vivek Kumar Singh 26 0 0 28 Mar 2023
HS-Pose: Hybrid Scope Feature Extraction for Category-level Object Pose Estimation Linfang Zheng Chen Wang Ying Sun Esha Dasgupta Hua Chen A. Leonardis Wei Zhang H. Chang 3DPC 97 44 0 28 Mar 2023
AgileGAN3D: Few-Shot 3D Portrait Stylization by Augmented Transfer Learning Guoxian Song Hongyi Xu Jing Liu Tiancheng Zhi Yichun Shi Jianfeng Zhang Zihang Jiang Jiashi Feng S. Sang Linjie Luo 3DH 55 6 0 24 Mar 2023
Toward Open-domain Slot Filling via Self-supervised Co-training A. Mosharrof Moghis Fereidouni A.B. Siddique 50 1 0 24 Mar 2023
TriPlaneNet: An Encoder for EG3D Inversion A. Bhattarai Matthias Nießner Artem Sevastopolsky 85 35 0 23 Mar 2023
A Survey of Historical Learning: Learning Models with Learning History Xiang Li Ge Wu Lingfeng Yang Wenzhe Wang Renjie Song Jian Yang MU AI4TS 103 2 0 23 Mar 2023
Unsupervised Domain Adaptation for Training Event-Based Networks Using Contrastive Learning and Uncorrelated Conditioning Dayuan Jian Mohammad Rostami 83 14 0 22 Mar 2023
Make Encoder Great Again in 3D GAN Inversion through Geometry and Occlusion-Aware Encoding Ziyang Yuan Yiming Zhu Yu Li Hongyu Liu Chun Yuan 3DV 64 37 0 22 Mar 2023
Picture that Sketch: Photorealistic Image Generation from Abstract Sketches Subhadeep Koley A. Bhunia Aneeshan Sain Pinaki Nath Chowdhury Tao Xiang Yi-Zhe Song 3DH 109 35 0 20 Mar 2023
Transformer Models for Type Inference in the Simply Typed Lambda Calculus: A Case Study in Deep Learning for Code Brando Miranda Avraham Shinnar V. Pestun B. Trager 37 3 0 15 Mar 2023
SPOTR: Spatio-temporal Pose Transformers for Human Motion Prediction A. A. Nargund Misha Sra ViT 90 2 0 11 Mar 2023
EfficientTempNet: Temporal Super-Resolution of Radar Rainfall B. Demiray M. Sit Ibrahim Demir 59 4 0 09 Mar 2023
InfoBatch: Lossless Training Speed Up by Unbiased Dynamic Data Pruning Ziheng Qin Kaidi Wang Zangwei Zheng Jianyang Gu Xiang Peng ... Daquan Zhou Lei Shang Baigui Sun Xuansong Xie Yang You 187 53 0 08 Mar 2023
Diffusing Gaussian Mixtures for Generating Categorical Data Florence Regol Mark Coates DiffM 85 5 0 08 Mar 2023
Judging Adam: Studying the Performance of Optimization Methods on ML4SE Tasks D. Pasechnyuk Anton Prazdnichnykh Mikhail Evtikhiev T. Bryksin 67 1 0 06 Mar 2023
FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model Rui Xue Yanqing Liu Lei He Xuejiao Tan Linquan Liu Ed Lin Sheng Zhao 118 7 0 06 Mar 2023
Fixed-point quantization aware training for on-device keyword-spotting Sashank Macha Om Oza Alex Escott Francesco Calivá Robert M. Armitano S. Cheekatmalla S. Parthasarathi Yuzong Liu MQ 42 4 0 04 Mar 2023
Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization Xingxuan Zhang Renzhe Xu Han Yu Hao Zou Peng Cui 77 41 0 03 Mar 2023
Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers Tianlong Chen Zhenyu Zhang Ajay Jaiswal Shiwei Liu Zhangyang Wang MoE 116 50 0 02 Mar 2023
Consistency Models Yang Song Prafulla Dhariwal Mark Chen Ilya Sutskever VLM DiffM 119 982 0 02 Mar 2023
BEL: A Bag Embedding Loss for Transformer enhances Multiple Instance Whole Slide Image Classification Daniel Sens Ario Sadafi F. P. Casale Nassir Navab Carsten Marr ViT MedIm 39 1 0 02 Mar 2023
I2V: Towards Texture-Aware Self-Supervised Blind Denoising using Self-Residual Learning for Real-World Images Kanggeun Lee K. Lee Won-Ki Jeong 69 0 0 21 Feb 2023
One-Shot Face Video Re-enactment using Hybrid Latent Spaces of StyleGAN2 Trevine Oorloff Yaser Yacoob CVBM 53 3 0 15 Feb 2023
The Role of Semantic Parsing in Understanding Procedural Text Hossein Rajaby Faghihi Parisa Kordjamshidi C. Teng J. Allen 67 5 0 14 Feb 2023
Symbolic Discovery of Optimization Algorithms Xiangning Chen Chen Liang Da Huang Esteban Real Kaiyuan Wang ... Xuanyi Dong Thang Luong Cho-Jui Hsieh Yifeng Lu Quoc V. Le 176 381 0 13 Feb 2023
FedDA: Faster Framework of Local Adaptive Gradient Methods via Restarted Dual Averaging Junyi Li Feihu Huang Heng-Chiao Huang FedML 74 1 0 13 Feb 2023
Multi-scale Feature Alignment for Continual Learning of Unlabeled Domains Kevin Thandiackal Luigi Piccinelli Pushpak Pati O. Goksel CLL OOD MedIm 79 7 0 02 Feb 2023
On Suppressing Range of Adaptive Stepsizes of Adam to Improve Generalisation Performance Guoqiang Zhang ODL 52 4 0 02 Feb 2023
A Survey of Deep Learning: From Activations to Transformers Johannes Schneider Michalis Vlachos ViT MedIm AI4TS AI4CE 112 10 0 01 Feb 2023