ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2202.00661
  4. Cited By
When Do Flat Minima Optimizers Work?

When Do Flat Minima Optimizers Work?

1 February 2022
Jean Kaddour
Linqing Liu
Ricardo M. A. Silva
Matt J. Kusner
    ODL
ArXivPDFHTML

Papers citing "When Do Flat Minima Optimizers Work?"

8 / 58 papers shown
Title
Causal Effect Inference for Structured Treatments
Causal Effect Inference for Structured Treatments
Jean Kaddour
Yuchen Zhu
Qi Liu
Matt J. Kusner
Ricardo M. A. Silva
CML
177
50
0
03 Jun 2021
MLP-Mixer: An all-MLP Architecture for Vision
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
271
2,603
0
04 May 2021
SWAD: Domain Generalization by Seeking Flat Minima
SWAD: Domain Generalization by Seeking Flat Minima
Junbum Cha
Sanghyuk Chun
Kyungjae Lee
Han-Cheol Cho
Seunghyun Park
Yunsung Lee
Sungrae Park
MoMe
216
423
0
17 Feb 2021
There Are Many Consistent Explanations of Unlabeled Data: Why You Should
  Average
There Are Many Consistent Explanations of Unlabeled Data: Why You Should Average
Ben Athiwaratkun
Marc Finzi
Pavel Izmailov
A. Wilson
202
243
0
14 Jun 2018
Bilevel Programming for Hyperparameter Optimization and Meta-Learning
Bilevel Programming for Hyperparameter Optimization and Meta-Learning
Luca Franceschi
P. Frasconi
Saverio Salzo
Riccardo Grazzi
Massimiliano Pontil
110
716
0
13 Jun 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language
  Understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,959
0
20 Apr 2018
Simple and Scalable Predictive Uncertainty Estimation using Deep
  Ensembles
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
Balaji Lakshminarayanan
Alexander Pritzel
Charles Blundell
UQCV
BDL
276
5,661
0
05 Dec 2016
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp
  Minima
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
296
2,890
0
15 Sep 2016
Previous
12