1907.08610
Cited By
Lookahead Optimizer: k steps forward, 1 step back
19 July 2019
Michael Ruogu Zhang
James Lucas
Geoffrey E. Hinton
Jimmy Ba
ODL
Papers citing "Lookahead Optimizer: k steps forward, 1 step back"
50 / 347 papers shown
Title
Locally Optimal Descent for Dynamic Stepsize Scheduling
Gilad Yehudai
Alon Cohen
Amit Daniely
Yoel Drori
Tomer Koren
Mariano Schain
26
0
0
23 Nov 2023
A Coefficient Makes SVRG Effective
Yida Yin
Zhiqiu Xu
Zhiyuan Li
Trevor Darrell
Zhuang Liu
25
1
0
09 Nov 2023
Optimal Guarantees for Algorithmic Reproducibility and Gradient Complexity in Convex Optimization
Liang Zhang
Junchi Yang
Amin Karbasi
Niao He
26
2
0
26 Oct 2023
Implicit meta-learning may lead language models to trust more reliable sources
Dmitrii Krasheninnikov
Egor Krasheninnikov
Bruno Mlodozeniec
Tegan Maharaj
David M. Krueger
26
3
0
23 Oct 2023
Stable Nonconvex-Nonconcave Training via Linear Interpolation
Thomas Pethick
Wanyun Xie
V. Cevher
27
5
0
20 Oct 2023
Over-the-Air Federated Learning and Optimization
Jingyang Zhu
Yuanming Shi
Yong Zhou
Chunxiao Jiang
Wei-Neng Chen
Khaled B. Letaief
FedML
23
11
0
16 Oct 2023
Deep Model Fusion: A Survey
Weishi Li
Yong Peng
Miao Zhang
Liang Ding
Han Hu
Li Shen
FedML
MoMe
28
51
0
27 Sep 2023
Exploring Flat Minima for Domain Generalization with Large Learning Rates
Jian Zhang
Lei Qi
Yinghuan Shi
Yang Gao
33
2
0
12 Sep 2023
Estimating exercise-induced fatigue from thermal facial images
Manuel Lage Cañellas
Constantino Álvarez Casado
L. Nguyen
Miguel Bordallo López
CVBM
16
0
0
12 Sep 2023
Stabilizing RNN Gradients through Pre-training
Luca Herranz-Celotti
Jean Rouat
24
0
0
23 Aug 2023
Deepbet: Fast brain extraction of T1-weighted MRI using Convolutional Neural Networks
L. Fisch
Stefan Zumdick
Carlotta B. C. Barkhau
D. Emden
J. Ernsting
...
K. Sarink
N. Winter
Benjamin Risse
U. Dannlowski
Tim Hahn
10
4
0
14 Aug 2023
MomentaMorph: Unsupervised Spatial-Temporal Registration with Momenta, Shooting, and Correction
Zhangxing Bian
Shuwen Wei
Yihao Liu
Junyu Chen
J. Zhuo
Fangxu Xing
Jonghye Woo
A. Carass
Jerry L. Prince
MedIm
16
2
0
05 Aug 2023
Multimodal Indoor Localisation in Parkinson's Disease for Detecting Medication Use: Observational Pilot Study in a Free-Living Setting
Ferdian Jovan
Catherine Morgan
Ryan McConville
E. Tonkin
I. Craddock
Alan Whone
23
2
0
03 Aug 2023
MFIM: Megapixel Facial Identity Manipulation
Sanghyeon Na
PICV
CVBM
38
4
0
03 Aug 2023
Lookbehind-SAM: k steps back, 1 step forward
Gonçalo Mordido
Pranshu Malviya
A. Baratin
Sarath Chandar
AAML
45
1
0
31 Jul 2023
LaFiCMIL: Rethinking Large File Classification from the Perspective of Correlated Multiple Instance Learning
Tiezhu Sun
Weiguo Pian
N. Daoudi
Kevin Allix
Tegawende F. Bissyande
Jacques Klein
31
1
0
30 Jul 2023
StylePrompter: All Styles Need Is Attention
Chenyi Zhuang
Pan Gao
A. Smolic
32
1
0
30 Jul 2023
Cross-dimensional transfer learning in medical image segmentation with deep learning
Hicham Messaoudi
Ahror Belaid
Douraied BEN SALEM
Pierre-Henri Conze
MedIm
22
23
0
29 Jul 2023
TransNet: Transparent Object Manipulation Through Category-Level Pose Estimation
Huijie Zhang
Anthony Opipari
Xiaotong Chen
Jiyue Zhu
Zeren Yu
Odest Chadwicke Jenkins
11
1
0
23 Jul 2023
Promoting Exploration in Memory-Augmented Adam using Critical Momenta
Pranshu Malviya
Gonçalo Mordido
A. Baratin
Reza Babanezhad Harikandeh
Jerry Huang
Simon Lacoste-Julien
Razvan Pascanu
Sarath Chandar
ODL
17
1
0
18 Jul 2023
Dual-Query Multiple Instance Learning for Dynamic Meta-Embedding based Tumor Classification
Simon Holdenried-Krafft
Peter Somers
Ivonne A. Montes-Majarro
Diana Silimon
Cristina Tarín
F. Fend
Hendrik P. A. Lensch
MedIm
27
3
0
14 Jul 2023
No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models
Jean Kaddour
Oscar Key
Piotr Nawrot
Pasquale Minervini
Matt J. Kusner
20
41
0
12 Jul 2023
The Whole Pathological Slide Classification via Weakly Supervised Learning
Qiehe Sun
Jiawen Li
Jin Xu
Junru Cheng
Tian Guan
Yonghong He
29
0
0
12 Jul 2023
Neural Architecture Transfer 2: A Paradigm for Improving Efficiency in Multi-Objective Neural Architecture Search
Simone Sarti
Eugenio Lomurno
Matteo Matteucci
19
0
0
03 Jul 2023
Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers
Yineng Chen
Z. Li
Lefei Zhang
Bo Du
Hai Zhao
25
4
0
02 Jul 2023
Structured State Space Models for Multiple Instance Learning in Digital Pathology
Leo Fillioux
Joseph Boyd
Maria Vakalopoulou
P. Cournède
Stergios Christodoulidis
6
21
0
27 Jun 2023
Efficient ResNets: Residual Network Design
Aditya Thakur
Harish Chauhan
Nikunj Gupta
19
0
0
21 Jun 2023
Partial Hypernetworks for Continual Learning
Hamed Hemati
Vincenzo Lomonaco
D. Bacciu
Damian Borth
CLL
11
7
0
19 Jun 2023
Algorithms of Sampling-Frequency-Independent Layers for Non-integer Strides
Kanami Imamura
Tomohiko Nakamura
Norihiro Takamune
Kohei Yatabe
Hiroshi Saruwatari
15
1
0
19 Jun 2023
Lookaround Optimizer: k steps around, 1 step average
Jiangtao Zhang
Shunyu Liu
Jie Song
Tongtian Zhu
Zhenxing Xu
Mingli Song
MoMe
29
6
0
13 Jun 2023
Single-Stage 3D Geometry-Preserving Depth Estimation Model Training on Dataset Mixtures with Uncalibrated Stereo Data
Nikolay Patakin
Mikhail Romanov
Anna Vorontsova
M. Artemyev
Anton Konushin
MDE
41
6
0
05 Jun 2023
SING: A Plug-and-Play DNN Learning Technique
Adrien Courtois
Damien Scieur
Jean-Michel Morel
Pablo Arias
Thomas Eboli
25
0
0
25 May 2023
Revisiting Token Dropping Strategy in Efficient BERT Pretraining
Qihuang Zhong
Liang Ding
Juhua Liu
Xuebo Liu
Min Zhang
Bo Du
Dacheng Tao
VLM
29
9
0
24 May 2023
Beyond Individual Input for Deep Anomaly Detection on Tabular Data
Hugo Thimonier
Fabrice Popineau
Arpad Rimmel
Bich-Liên Doan
16
5
0
24 May 2023
Deep Multiple Instance Learning with Distance-Aware Self-Attention
Georg Wolflein
Lucie Charlotte Magister
Pietro Lio'
David J. Harrison
Ognjen Arandjelovic
19
2
0
17 May 2023
Semi-Supervised Segmentation of Functional Tissue Units at the Cellular Level
V. Sydorskyi
Igor Krashenyi
Denis Savka
Oleksandr Zarichkovyi
14
1
0
03 May 2023
SketchXAI: A First Look at Explainability for Human Sketches
Zhiyu Qu
Yulia Gryaditskaya
Ke Li
Kaiyue Pang
Tao Xiang
Yi-Zhe Song
21
8
0
23 Apr 2023
Hierarchical Weight Averaging for Deep Neural Networks
Xiaozhe Gu
Zixun Zhang
Yuncheng Jiang
Tao Luo
Ruimao Zhang
Shuguang Cui
Zhuguo Li
19
5
0
23 Apr 2023
Neuromorphic computing for attitude estimation onboard quadrotors
S. Stroobants
Julien Dupeyroux
Guido de Croon
27
4
0
18 Apr 2023
VISION DIFFMASK: Faithful Interpretation of Vision Transformers with Differentiable Patch Masking
A. Nalmpantis
Apostolos Panagiotopoulos
John Gkountouras
Konstantinos Papakostas
Wilker Aziz
15
4
0
13 Apr 2023
β-Variational autoencoders and transformers for reduced-order modelling of fluid flows
Alberto Solera-Rico
Carlos Sanmiguel Vila
Miguel Gómez-López
Yuning Wang
Abdulrahman Almashjary
Scott T. M. Dawson
Ricardo Vinuesa
DRL
13
74
0
07 Apr 2023
Astroformer: More Data Might not be all you need for Classification
Rishit Dagli
28
7
0
03 Apr 2023
HS-Pose: Hybrid Scope Feature Extraction for Category-level Object Pose Estimation
Linfang Zheng
Chen Wang
Ying Sun
Esha Dasgupta
Hua Chen
A. Leonardis
Wei K. Zhang
H. Chang
3DPC
29
41
0
28 Mar 2023
TriPlaneNet: An Encoder for EG3D Inversion
A. Bhattarai
Matthias Nießner
Artem Sevastopolsky
36
34
0
23 Mar 2023
Make Encoder Great Again in 3D GAN Inversion through Geometry and Occlusion-Aware Encoding
Ziyang Yuan
Yiming Zhu
Yu Li
Hongyu Liu
Chun Yuan
3DV
29
37
0
22 Mar 2023
Picture that Sketch: Photorealistic Image Generation from Abstract Sketches
Subhadeep Koley
A. Bhunia
Aneeshan Sain
Pinaki Nath Chowdhury
Tao Xiang
Yi-Zhe Song
3DH
19
31
0
20 Mar 2023
Judging Adam: Studying the Performance of Optimization Methods on ML4SE Tasks
D. Pasechnyuk
Anton Prazdnichnykh
Mikhail Evtikhiev
T. Bryksin
24
1
0
06 Mar 2023
FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model
Rui Xue
Yanqing Liu
Lei He
Xuejiao Tan
Linquan Liu
Ed Lin
Sheng Zhao
26
7
0
06 Mar 2023
Dropout Reduces Underfitting
Zhuang Liu
Zhi-Qin John Xu
Joseph Jin
Zhiqiang Shen
Trevor Darrell
34
36
0
02 Mar 2023
BEL: A Bag Embedding Loss for Transformer enhances Multiple Instance Whole Slide Image Classification
Daniel Sens
Ario Sadafi
F. P. Casale
Nassir Navab
Carsten Marr
ViT
MedIm
17
1
0
02 Mar 2023