Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2212.05956
Cited By
Improving Generalization of Pre-trained Language Models via Stochastic Weight Averaging
12 December 2022
Peng Lu
I. Kobyzev
Mehdi Rezagholizadeh
Ahmad Rashid
A. Ghodsi
Philippe Langlais
MoMe
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Improving Generalization of Pre-trained Language Models via Stochastic Weight Averaging"
8 / 8 papers shown
Title
Uncertainty-Aware Natural Language Inference with Stochastic Weight Averaging
Aarne Talman
H. Çelikkanat
Sami Virpioja
Markus Heinonen
Jörg Tiedemann
BDL
UQCV
11
7
0
10 Apr 2023
Knowledge is a Region in Weight Space for Fine-tuned Language Models
Almog Gueta
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
13
49
0
09 Feb 2023
Anticorrelated Noise Injection for Improved Generalization
Antonio Orvieto
Hans Kersting
F. Proske
Francis R. Bach
Aurélien Lucchi
50
44
0
06 Feb 2022
Sharpness-Aware Minimization Improves Language Model Generalization
Dara Bahri
H. Mobahi
Yi Tay
119
82
0
16 Oct 2021
SWAD: Domain Generalization by Seeking Flat Minima
Junbum Cha
Sanghyuk Chun
Kyungjae Lee
Han-Cheol Cho
Seunghyun Park
Yunsung Lee
Sungrae Park
MoMe
216
338
0
17 Feb 2021
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
220
3,054
0
23 Jan 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
273
2,696
0
15 Sep 2016
1