Lookahead Optimizer: k steps forward, 1 step back


19 July 2019
Michael Ruogu Zhang
James Lucas
Geoffrey E. Hinton
Jimmy Ba
    ODL

Papers citing "Lookahead Optimizer: k steps forward, 1 step back"

50 / 347 papers shown
Domain-Specific Pre-training Improves Confidence in Whole Slide Image Classification
S. Chitnis
Sidong Liu
T. Dash
T. Verlekar
A. Di Ieva
S. Berkovsky
L. Vig
A. Srinivasan
19
4
0
20 Feb 2023
One-Shot Face Video Re-enactment using Hybrid Latent Spaces of StyleGAN2
Trevine Oorloff
Yaser Yacoob
CVBM
32
3
0
15 Feb 2023
Unlocking Deterministic Robustness Certification on ImageNet
Kaiqin Hu
Andy Zou
Zifan Wang
Klas Leino
Matt Fredrikson
OOD
21
12
0
29 Jan 2023
What Decreases Editing Capability? Domain-Specific Hybrid Refinement for Improved GAN Inversion
Pu Cao
Lu Yang
Dongxu Liu
Zhiwei Liu
Shan Li
Q. Song
27
6
0
28 Jan 2023
FewShotTextGCN: K-hop neighborhood regularization for few-shot learning on graphs
Niels van der Heijden
Ekaterina Shutova
H. Yannakoudakis
23
0
0
25 Jan 2023
Read the Signs: Towards Invariance to Gradient Descent's Hyperparameter Initialization
Davood Wadi
M. Fredette
S. Sénécal
ODL
AI4CE
6
0
0
24 Jan 2023
Multi-fidelity surrogate modeling for temperature field prediction using deep convolution neural network
Yunyang Zhang
Zhiqiang Gong
Weien Zhou
Xiaoyu Zhao
Xiaohu Zheng
W. Yao
AI4CE
15
24
0
17 Jan 2023
Improving Depression estimation from facial videos with face alignment, training optimization and scheduling
Manuel Lage Cañellas
Constantino Álvarez Casado
L. Nguyen
Miguel Bordallo López
CVBM
14
3
0
13 Dec 2022
Real-time Sampling-based Model Predictive Control based on Reverse Kullback-Leibler Divergence and Its Adaptive Acceleration
Taisuke Kobayashi
Kota Fukumoto
11
4
0
08 Dec 2022
A survey of deep learning optimizers -- first and second order methods
Rohan Kashyap
ODL
31
6
0
28 Nov 2022
GAN Inversion for Image Editing via Unsupervised Domain Adaptation
Siyu Xing
Chen Gong
Hewei Guo
Xiaoyi Zhang
Xinwen Hou
Yu Liu
32
6
0
22 Nov 2022
Efficient Generalization Improvement Guided by Random Weight Perturbation
Tao Li
Wei Yan
Zehao Lei
Yingwen Wu
Kun Fang
Ming Yang
X. Huang
AAML
35
6
0
21 Nov 2022
Delving StyleGAN Inversion for Image Editing: A Foundation Latent Space Viewpoint
Hongyu Liu
Yibing Song
Qifeng Chen
DiffM
28
21
0
21 Nov 2022
Can neural networks extrapolate? Discussion of a theorem by Pedro Domingos
Adrien Courtois
Jean-Michel Morel
Pablo Arias
14
5
0
07 Nov 2022
Momentum-based Weight Interpolation of Strong Zero-Shot Models for Continual Learning
Zafir Stojanovski
Karsten Roth
Zeynep Akata
18
16
0
06 Nov 2022
Iterative Teaching by Data Hallucination
Zeju Qiu
Weiyang Liu
Tim Z. Xiao
Zhen Liu
Umang Bhatt
Yucen Luo
Adrian Weller
Bernhard Schölkopf
29
9
0
31 Oct 2022
Reduce Catastrophic Forgetting of Dense Retrieval Training with Teleportation Negatives
Si Sun
Chenyan Xiong
Yue Yu
Arnold Overwijk
Zhiyuan Liu
Jie Bao
35
6
0
31 Oct 2022
SAM as an Optimal Relaxation of Bayes
Thomas Möllenhoff
Mohammad Emtiyaz Khan
BDL
29
32
0
04 Oct 2022
Combined Dynamic Virtual Spatiotemporal Graph Mapping for Traffic Prediction
Ying-Hung Pu
AI4TS
14
0
0
03 Oct 2022
Stop Wasting My Time! Saving Days of ImageNet and BERT Training with Latest Weight Averaging
Jean Kaddour
MoMe
3DH
19
39
0
29 Sep 2022
Beat Transformer: Demixed Beat and Downbeat Tracking with Dilated Self-Attention
Jingwei Zhao
Gus Xia
Ye Wang
21
18
0
15 Sep 2022
SketchBetween: Video-to-Video Synthesis for Sprite Animation via Sketches
Dagmar Lukka Loftsdóttir
Matthew J. Guzdial
VGen
15
3
0
01 Sep 2022
Interpretable (not just posthoc-explainable) medical claims modeling for discharge placement to prevent avoidable all-cause readmissions or death
Joshua C. Chang
Ted L. Chang
Carson C. Chow
R. Mahajan
Sonya Mahajan
Joe Maisog
Shashaank Vattikuti
Hongjing Xia
FAtt
OOD
34
0
0
28 Aug 2022
Lottery Pools: Winning More by Interpolating Tickets without Increasing Training or Inference Cost
Lu Yin
Shiwei Liu
Meng Fang
Tianjin Huang
Vlado Menkovski
Mykola Pechenizkiy
19
13
0
23 Aug 2022
CM-MLP: Cascade Multi-scale MLP with Axial Context Relation Encoder for Edge Segmentation of Medical Image
Jinkai Lv
Yuyong Hu
Quanshui Fu
Zhiwang Zhang
Yuqiang Hu
Lin Lv
Guoqing Yang
Jinpeng Li
Yi Zhao
MedIm
17
9
0
23 Aug 2022
TransNet: Category-Level Transparent Object Pose Estimation
Huijie Zhang
Anthony Opipari
Xiaotong Chen
Jiyue Zhu
Zeren Yu
Odest Chadwicke Jenkins
ViT
17
12
0
22 Aug 2022
SSP-Pose: Symmetry-Aware Shape Prior Deformation for Direct Category-Level Object Pose Estimation
Ruida Zhang
Yan Di
Fabian Manhardt
F. Tombari
Xiangyang Ji
9
35
0
13 Aug 2022
Regularizing Deep Neural Networks with Stochastic Estimators of Hessian Trace
Yucong Liu
Shixing Yu
Tong Lin
20
1
0
11 Aug 2022
Boosting Video-Text Retrieval with Explicit High-Level Semantics
Haoran Wang
Di Xu
Dongliang He
Fu Li
Zhong Ji
Jungong Han
Errui Ding
24
11
0
08 Aug 2022
RBP-Pose: Residual Bounding Box Projection for Category-Level Pose Estimation
Ruida Zhang
Yan Di
Zhiqiang Lou
Fabian Manhardt
F. Tombari
Xiangyang Ji
3DPC
14
47
0
30 Jul 2022
PEA: Improving the Performance of ReLU Networks for Free by Using Progressive Ensemble Activations
Á. Utasi
27
0
0
28 Jul 2022
On the benefits of non-linear weight updates
Paul Norridge
18
0
0
25 Jul 2022
Easy Batch Normalization
Arip Asadulaev
Alexander Panfilov
Andrey Filchenkov
AAML
8
0
0
18 Jul 2022
CATRE: Iterative Point Clouds Alignment for Category-level Object Pose Refinement
Xingyu Liu
Gu Wang
Yi Li
Xiangyang Ji
3DPC
19
28
0
17 Jul 2022
Benchopt: Reproducible, efficient and collaborative optimization benchmarks
Thomas Moreau
Mathurin Massias
Alexandre Gramfort
Pierre Ablin
Pierre-Antoine Bannier
Benjamin Charlier
...
Binh Duc Nguyen
A. Rakotomamonjy
Zaccharie Ramzi
Joseph Salmon
Samuel Vaiter
51
31
0
27 Jun 2022
NVIDIA-UNIBZ Submission for EPIC-KITCHENS-100 Action Anticipation Challenge 2022
Tsung-Ming Tai
O. Lanz
G. Fiameni
Yi-Kwan Wong
Sze-Sen Poon
Cheng-Kuang Lee
Ka Chun Cheung
Simon See
6
1
0
22 Jun 2022
Solving Constrained Variational Inequalities via a First-order Interior Point-based Method
Tong Yang
Michael I. Jordan
Tatjana Chavdarova
27
9
0
21 Jun 2022
Unified Recurrence Modeling for Video Action Anticipation
Tsung-Ming Tai
G. Fiameni
Cheng-Kuang Lee
Simon See
O. Lanz
21
8
0
02 Jun 2022
Hopular: Modern Hopfield Networks for Tabular Data
Bernhard Schafl
Lukas Gruber
Angela Bitto-Nemling
Sepp Hochreiter
LMTD
25
27
0
01 Jun 2022
Deepfake Caricatures: Amplifying attention to artifacts increases deepfake detection by humans and machines
Camilo Luciano Fosco
Emilie Josephs
A. Andonian
Allen Lee
Xi Wang
A. Oliva
37
4
0
01 Jun 2022
Superposing Many Tickets into One: A Performance Booster for Sparse Neural Network Training
Lu Yin
Vlado Menkovski
Meng Fang
Tianjin Huang
Yulong Pei
Mykola Pechenizkiy
D. Mocanu
Shiwei Liu
26
8
0
30 May 2022
AttentionCode: Ultra-Reliable Feedback Codes for Short-Packet Communications
Yulin Shao
Emre Ozfatura
A. Perotti
B. Popović
Deniz Gunduz
28
21
0
30 May 2022
Object-wise Masked Autoencoders for Fast Pre-training
Jiantao Wu
Shentong Mo
ViT
OCL
17
15
0
28 May 2022
Trainable Weight Averaging: Accelerating Training and Improving Generalization
Tao Li
Zhehao Huang
Yingwen Wu
Zhengbao He
Qinghua Tao
X. Huang
Chih-Jen Lin
MoMe
50
3
0
26 May 2022
Semi-Parametric Inducing Point Networks and Neural Processes
R. Rastogi
Yair Schiff
Alon Hacohen
Zhaozhi Li
I-Hsiang Lee
Yuntian Deng
M. Sabuncu
Volodymyr Kuleshov
3DPC
24
6
0
24 May 2022
Diverse Weight Averaging for Out-of-Distribution Generalization
Alexandre Ramé
Matthieu Kirchmeyer
Thibaud Rahier
A. Rakotomamonjy
Patrick Gallinari
Matthieu Cord
OOD
196
128
0
19 May 2022
Local Attention Graph-based Transformer for Multi-target Genetic Alteration Prediction
Daniel Reisenbüchler
S. J. Wagner
Melanie Boxberg
T. Peng
MedIm
24
22
0
13 May 2022
Multimodal Indoor Localisation for Measuring Mobility in Parkinson's Disease using Transformers
Ferdian Jovan
Ryan McConville
Catherine Morgan
E. Tonkin
Alan Whone
I. Craddock
9
1
0
12 May 2022
LAWS: Look Around and Warm-Start Natural Gradient Descent for Quantum Neural Networks
Zeyi Tao
Jindi Wu
Qi Xia
Qun Li
23
9
0
05 May 2022
CLIP-Art: Contrastive Pre-training for Fine-Grained Art Classification
Marcos V. Conde
Kerem Turgutlu
CLIP
VLM
28
94
0
29 Apr 2022