2402.09240
Cited By
Switch EMA: A Free Lunch for Better Flatness and Sharpness
14 February 2024
Siyuan Li
Zicheng Liu
Juanxi Tian
Ge Wang
Zedong Wang
Weiyang Jin
Di Wu
Cheng Tan
Tao Lin
Yang Liu
Baigui Sun
Stan Z. Li
Papers citing
"Switch EMA: A Free Lunch for Better Flatness and Sharpness"
15 / 15 papers shown
Title
PANDORA: Diffusion Policy Learning for Dexterous Robotic Piano Playing
Yanjia Huang
Renjie Li
Zhengzhong Tu
VGen
56
0
0
17 Mar 2025
Understanding Flatness in Generative Models: Its Role and Benefits
Taehwan Lee
Kyeongkook Seo
Jaejun Yoo
Sung Whan Yoon
DiffM
51
0
0
14 Mar 2025
DUNIA: Pixel-Sized Embeddings via Cross-Modal Alignment for Earth Observation Applications
Ibrahim Fayad
Max Zimmer
Martin Schwartz
P. Ciais
Fabian Gieseke
Gabriel Belouze
Sarah Brood
A. D. Truchis
Alexandre d’Aspremont
AI4TS
38
0
0
24 Feb 2025
When, Where and Why to Average Weights?
Niccolò Ajroldi
Antonio Orvieto
Jonas Geiping
MoMe
83
0
0
10 Feb 2025
Retrieval Meets Reasoning: Even High-school Textbook Knowledge Benefits Multimodal Reasoning
Cheng Tan
Jingxuan Wei
Linzhuang Sun
Zhangyang Gao
Siyuan Li
Bihui Yu
Ruifeng Guo
Stan Z. Li
ReLM
LRM
3DV
64
6
0
31 May 2024
Trainable Weight Averaging: Accelerating Training and Improving Generalization
Tao Li
Zhehao Huang
Yingwen Wu
Zhengbao He
Qinghua Tao
X. Huang
Chih-Jen Lin
MoMe
50
3
0
26 May 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,412
0
11 Nov 2021
Efficient Sharpness-aware Minimization for Improved Training of Neural Networks
Jiawei Du
Hanshu Yan
Jiashi Feng
Joey Tianyi Zhou
Liangli Zhen
Rick Siow Mong Goh
Vincent Y. F. Tan
AAML
102
132
0
07 Oct 2021
MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer
Sachin Mehta
Mohammad Rastegari
ViT
189
1,200
0
05 Oct 2021
ResNet strikes back: An improved training procedure in timm
Ross Wightman
Hugo Touvron
Hervé Jégou
AI4TS
207
484
0
01 Oct 2021
Mobile-Former: Bridging MobileNet and Transformer
Yinpeng Chen
Xiyang Dai
Dongdong Chen
Mengchen Liu
Xiaoyi Dong
Lu Yuan
Zicheng Liu
ViT
172
474
0
12 Aug 2021
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
239
2,592
0
04 May 2021
Improved Baselines with Momentum Contrastive Learning
Xinlei Chen
Haoqi Fan
Ross B. Girshick
Kaiming He
SSL
238
3,359
0
09 Mar 2020
Densely Connected Convolutional Networks
Gao Huang
Zhuang Liu
Laurens van der Maaten
Kilian Q. Weinberger
PINN
3DV
247
36,237
0
25 Aug 2016
Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting
Xingjian Shi
Zhourong Chen
Hao Wang
Dit-Yan Yeung
W. Wong
W. Woo
201
7,884
0
13 Jun 2015