ResearchTrend.AI
Lookahead Optimizer: k steps forward, 1 step back

19 July 2019
Michael Ruogu Zhang
James Lucas
Geoffrey E. Hinton
Jimmy Ba
    ODL

Papers citing "Lookahead Optimizer: k steps forward, 1 step back"

50 / 347 papers shown
Title
A novel Neural-ODE model for the state of health estimation of lithium-ion battery using charging curve
Yiming Li
Man He
J. Liu
19
0
0
09 May 2025
DGSAM: Domain Generalization via Individual Sharpness-Aware Minimization
Youngjun Song
Youngsik Hwang
Jonghun Lee
Heechang Lee
Dong-Young Lim
AAML
49
0
0
30 Mar 2025
Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo
Zachary B. Charles
Gabriel Teston
Lucio Dery
Keith Rush
Nova Fallen
Zachary Garrett
Arthur Szlam
Arthur Douillard
143
0
0
12 Mar 2025
Hierarchical Semantic Compression for Consistent Image Semantic Restoration
Shengxi Li
Zifu Zhang
Mai Xu
Lai Jiang
Yufan Liu
Ce Zhu
DiffM
38
0
0
24 Feb 2025
Carefully Blending Adversarial Training, Purification, and Aggregation Improves Adversarial Robustness
Emanuele Ballarin
A. Ansuini
Luca Bortolussi
AAML
62
0
0
20 Feb 2025
ReFlow6D: Refraction-Guided Transparent Object 6D Pose Estimation via Intermediate Representation Learning
Hrishikesh Gupta
S. Thalhammer
Jean-Baptiste Weibel
Alexander Haberl
Markus Vincze
29
0
0
31 Dec 2024
A Unified Analysis of Federated Learning with Arbitrary Client Participation
Shiqiang Wang
Mingyue Ji
FedML
37
55
0
31 Dec 2024
Transformer-based toxin-protein interaction analysis prioritizes airborne particulate matter components with potential adverse health effects
Yan Zhu
Shihao Wang
Yong Han
Yao Lu
Shulan Qiu
Ling Jin
Xiangdong Li
Weixiong Zhang
73
1
0
21 Dec 2024
Distributed Sign Momentum with Local Steps for Training Transformers
Shuhua Yu
Ding Zhou
Cong Xie
An Xu
Zhi-Li Zhang
Xin Liu
S. Kar
64
0
0
26 Nov 2024
Retinal Vessel Segmentation via Neuron Programming
Tingting Wu
Ruyi Min
Peixuan Song
Hengtao Guo
Tieyong Zeng
Feng-Lei Fan
31
0
0
17 Nov 2024
ParaGAN: A Scalable Distributed Training Framework for Generative Adversarial Networks
Ziji Shi
Jialin Li
Yang You
26
1
0
06 Nov 2024
OledFL: Unleashing the Potential of Decentralized Federated Learning via Opposite Lookahead Enhancement
Qinglun Li
Miao Zhang
Mengzhu Wang
Quanjun Yin
Li Shen
OODD
FedML
19
0
0
09 Oct 2024
MECFormer: Multi-task Whole Slide Image Classification with Expert Consultation Network
Doanh C. Bui
Jin Tae Kwak
DiffM
MedIm
23
0
0
06 Oct 2024
Image First or Text First? Optimising the Sequencing of Modalities in Large Language Model Prompting and Reasoning Tasks
Grant Wardle
Teo Susnjak
LRM
26
5
0
04 Oct 2024
Dual Encoder GAN Inversion for High-Fidelity 3D Head Reconstruction from Single Images
Bahri Batuhan Bilecen
Ahmet Berke Gokmen
Aysegül Dündar
30
1
0
30 Sep 2024
Unified Gradient-Based Machine Unlearning with Remain Geometry Enhancement
Zhehao Huang
Xinwen Cheng
JingHao Zheng
Haoran Wang
Zhengbao He
Tao Li
X. Huang
MU
40
4
0
29 Sep 2024
DetectBERT: Towards Full App-Level Representation Learning to Detect Android Malware
Tiezhu Sun
N. Daoudi
Kisub Kim
Kevin Allix
Tegawende F. Bissyande
Jacques Klein
17
3
0
29 Aug 2024
Decentralized Federated Learning with Model Caching on Mobile Agents
Xiaoyu Wang
Guojun Xiong
Houwei Cao
Jian Li
Yong Liu
FedML
21
1
0
26 Aug 2024
Rethinking Pre-Trained Feature Extractor Selection in Multiple Instance Learning for Whole Slide Image Classification
Bryan Wong
Mun Yong Yi
VLM
50
0
0
02 Aug 2024
Monocular pose estimation of articulated surgical instruments in open surgery
Robert Spektor
Tom Friedman
Itay Or
Gil Bolotin
S. Laufer
30
0
0
16 Jul 2024
Latent Space Imaging
Matheus Souza
Yidan Zheng
Kaizhang Kang
Yogeshwar Nath Mishra
Qiang Fu
Wolfgang Heidrich
57
0
0
09 Jul 2024
DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification
Wenhui Zhu
Xiwen Chen
Peijie Qiu
Aristeidis Sotiras
Abolfazl Razi
Yalin Wang
32
5
0
04 Jul 2024
PathoWAve: A Deep Learning-based Weight Averaging Method for Improving Domain Generalization in Histopathology Images
Parastoo Sotoudeh Sharifi
M. Omair Ahmad
M. N. S. Swamy
MoMe
OOD
36
0
0
21 Jun 2024
Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit
Jason D. Lee
Kazusato Oko
Taiji Suzuki
Denny Wu
MLT
87
21
0
03 Jun 2024
The Road Less Scheduled
Aaron Defazio
Xingyu Yang
Harsh Mehta
Konstantin Mishchenko
Ahmed Khaled
Ashok Cutkosky
28
45
0
24 May 2024
Repetita Iuvant: Data Repetition Allows SGD to Learn High-Dimensional Multi-Index Functions
Luca Arnaboldi
Yatin Dandi
Florent Krzakala
Luca Pesce
Ludovic Stephan
61
12
0
24 May 2024
Prior-guided Diffusion Model for Cell Segmentation in Quantitative Phase Imaging
Zhuchen Shao
M. Anastasio
Hua Li
DiffM
MedIm
30
1
0
10 May 2024
TimeMIL: Advancing Multivariate Time Series Classification via a Time-aware Multiple Instance Learning
Xiwen Chen
Peijie Qiu
Wenhui Zhu
Huayu Li
Hao Wang
Aristeidis Sotiras
Yalin Wang
Abolfazl Razi
AI4TS
29
7
0
06 May 2024
Image segmentation of treated and untreated tumor spheroids by Fully Convolutional Networks
Matthias Streller
S. Michlíková
Willy Ciecior
Katharina Lönnecke
L. Kunz-Schughart
Steffen Lange
Anja Voss-Böhme
49
1
0
02 May 2024
FisheyeDetNet: 360° Surround view Fisheye Camera based Object Detection System for Autonomous Driving
Ganesh Sistu
S. Yogamani
36
0
0
20 Apr 2024
RaSim: A Range-aware High-fidelity RGB-D Data Simulation Pipeline for Real-world Applications
Xingyu Liu
Chenyangguang Zhang
Gu Wang
Ruida Zhang
Xiangyang Ji
34
1
0
05 Apr 2024
94% on CIFAR-10 in 3.29 Seconds on a Single GPU
Keller Jordan
VLM
29
5
0
30 Mar 2024
Revisiting Random Weight Perturbation for Efficiently Improving Generalization
Tao Li
Qinghua Tao
Weihao Yan
Zehao Lei
Yingwen Wu
Kun Fang
M. He
Xiaolin Huang
AAML
37
5
0
30 Mar 2024
All-in-One: Heterogeneous Interaction Modeling for Cold-Start Rating Prediction
Shuheng Fang
Kangfei Zhao
Yu Rong
Zhixun Li
Jeffrey Xu Yu
24
0
0
26 Mar 2024
Predicting Perceived Gloss: Do Weak Labels Suffice?
Julia Guerrero-Viu
J. Daniel Subias
Ana Serrano
Katherine R. Storrs
Roland W. Fleming
B. Masiá
Diego F. F. Gutierrez
29
2
0
26 Mar 2024
FedMIL: Federated-Multiple Instance Learning for Video Analysis with Optimized DPP Scheduling
Ashish Bastola
Hao Wang
Xiwen Chen
Abolfazl Razi
26
0
0
26 Mar 2024
TexTile: A Differentiable Metric for Texture Tileability
Carlos Rodriguez-Pardo
Dan Casas
Elena Garces
Jorge López-Moreno
DiffM
28
4
0
19 Mar 2024
Biophysics Informed Pathological Regularisation for Brain Tumour Segmentation
Lipei Zhang
Yanqi Cheng
Lihao Liu
Carola-Bibiane Schönlieb
Angelica I Aviles-Rivero
AI4CE
24
7
0
14 Mar 2024
Remove that Square Root: A New Efficient Scale-Invariant Version of AdaGrad
Sayantan Choudhury
N. Tupitsa
Nicolas Loizou
Samuel Horváth
Martin Takáč
Eduard A. Gorbunov
30
1
0
05 Mar 2024
Towards Principled Task Grouping for Multi-Task Learning
Chenguang Wang
Xuanhao Pan
Tianshu Yu
31
0
0
23 Feb 2024
Gradual Residuals Alignment: A Dual-Stream Framework for GAN Inversion and Image Attribute Editing
Hao Li
Mengqi Huang
Lei Zhang
Bo Hu
Yi Liu
Zhendong Mao
DiffM
35
2
0
22 Feb 2024
Switch EMA: A Free Lunch for Better Flatness and Sharpness
Siyuan Li
Zicheng Liu
Juanxi Tian
Ge Wang
Zedong Wang
...
Cheng Tan
Tao Lin
Yang Liu
Baigui Sun
Stan Z. Li
30
6
0
14 Feb 2024
Multi-Scale Semantic Segmentation with Modified MBConv Blocks
Xi Chen
Yang Cai
Yuan Wu
Bo Xiong
Taesung Park
SSeg
25
0
0
07 Feb 2024
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Yichao Fu
Peter Bailis
Ion Stoica
Hao Zhang
125
139
0
03 Feb 2024
A Note On Lookahead In Real Life And Computing
Burle Sharma
Rakesh Mohanty
Sucheta Panda
8
0
0
02 Feb 2024
Making Parametric Anomaly Detection on Tabular Data Non-Parametric Again
Hugo Thimonier
Fabrice Popineau
Arpad Rimmel
Bich-Liên Doan
18
1
0
30 Jan 2024
Finetuning Foundation Models for Joint Analysis Optimization
M. Vigl
N. Hartman
L. Heinrich
40
12
0
24 Jan 2024
Enhancing Digital Hologram Reconstruction Using Reverse-Attention Loss for Untrained Physics-Driven Deep Learning Models with Uncertain Distance
Xiwen Chen
Hao Wang
Zhao Zhang
Zhenmin Li
Huayu Li
Tong Ye
Abolfazl Razi
22
1
0
11 Jan 2024
Brain Tumor Segmentation Based on Deep Learning, Attention Mechanisms, and Energy-Based Uncertainty Prediction
Zachary Schwehr
Sriman Achanta
26
2
0
31 Dec 2023
Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training
Yefan Zhou
Tianyu Pang
Keqin Liu
Charles H. Martin
Michael W. Mahoney
Yaoqing Yang
34
7
0
01 Dec 2023