Averaging Weights Leads to Wider Optima and Better Generalization


14 March 2018
Pavel Izmailov
Dmitrii Podoprikhin
T. Garipov
Dmitry Vetrov
A. Wilson
    FedML
    MoMe

Papers citing "Averaging Weights Leads to Wider Optima and Better Generalization"

50 / 305 papers shown
Flatness Improves Backbone Generalisation in Few-shot Classification
Rui Li
Martin Trapp
Marcus Klasson
Arno Solin
41
0
0
11 Apr 2024
Investigation of Energy-efficient AI Model Architectures and Compression Techniques for "Green" Fetal Brain Segmentation
Szymon Mazurek
M. Pytlarz
Sylwia Malec
A. Crimi
31
0
0
03 Apr 2024
Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better
En-hao Liu
Junyi Zhu
Zinan Lin
Xuefei Ning
Shuaiqi Wang
...
Sergey Yekhanin
Guohao Dai
Huazhong Yang
Yu-Xiang Wang
Yu Wang
MoMe
55
4
0
02 Apr 2024
Embracing Unknown Step by Step: Towards Reliable Sparse Training in Real World
Bowen Lei
Dongkuan Xu
Ruqi Zhang
Bani Mallick
UQCV
31
0
0
29 Mar 2024
Arcee's MergeKit: A Toolkit for Merging Large Language Models
Charles Goddard
Shamane Siriwardhana
Malikeh Ehghaghi
Luke Meyers
Vladimir Karpukhin
Brian Benedict
Mark McQuade
Jacob Solawetz
MoMe
KELM
82
77
0
20 Mar 2024
HeteroSwitch: Characterizing and Taming System-Induced Data Heterogeneity in Federated Learning
Gyudong Kim
Mehdi Ghasemi
Soroush Heidari
Seungryong Kim
Young Geun Kim
S. Vrudhula
Carole-Jean Wu
27
1
0
07 Mar 2024
Revisiting Confidence Estimation: Towards Reliable Failure Prediction
Fei Zhu
Xu-Yao Zhang
Zhen Cheng
Cheng-Lin Liu
UQCV
44
10
0
05 Mar 2024
Fine-tuning with Very Large Dropout
Jianyu Zhang
Léon Bottou
37
1
0
01 Mar 2024
Adversarial Example Soups: Improving Transferability and Stealthiness for Free
Bo Yang
Hengwei Zhang
Jin-dong Wang
Yulong Yang
Chenhao Lin
Chao Shen
Zhengyu Zhao
SILM
AAML
59
1
0
27 Feb 2024
Learning under Label Noise through Few-Shot Human-in-the-Loop Refinement
Aaqib Saeed
Dimitris Spathis
Jungwoo Oh
Edward Choi
Ali Etemad
NoLa
19
2
0
25 Jan 2024
Doubly Perturbed Task Free Continual Learning
Byung Hyun Lee
Min-hwan Oh
Se Young Chun
11
3
0
20 Dec 2023
IPAD: Iterative, Parallel, and Diffusion-based Network for Scene Text Recognition
Xiaomeng Yang
Zhi Qiao
Yu Zhou
DiffM
57
1
0
19 Dec 2023
Open Domain Generalization with a Single Network by Regularization Exploiting Pre-trained Features
Inseop Chung
Kiyoon Yoo
Nojun Kwak
VLM
14
0
0
08 Dec 2023
Analyzing and Improving the Training Dynamics of Diffusion Models
Tero Karras
M. Aittala
J. Lehtinen
Janne Hellsten
Timo Aila
S. Laine
28
153
0
05 Dec 2023
Seg2Reg: Differentiable 2D Segmentation to 1D Regression Rendering for 360 Room Layout Reconstruction
Cheng Sun
Wei-En Tai
Yu-Lin Shih
Kuan-Wei Chen
Yong-Jing Syu
Kent Selwyn The
Yu-Chiang Frank Wang
Hwann-Tzong Chen
3DV
30
2
0
30 Nov 2023
Efficient Stitchable Task Adaptation
Haoyu He
Zizheng Pan
Jing Liu
Jianfei Cai
Bohan Zhuang
24
3
0
29 Nov 2023
Critical Influence of Overparameterization on Sharpness-aware Minimization
Sungbin Shin
Dongyeop Lee
Maksym Andriushchenko
Namhoon Lee
AAML
39
1
0
29 Nov 2023
Parameter Exchange for Robust Dynamic Domain Generalization
Luojun Lin
Zhifeng Shen
Zhishu Sun
Yuanlong Yu
Lei Zhang
Weijie Chen
OOD
17
6
0
23 Nov 2023
Language and Task Arithmetic with Parameter-Efficient Layers for Zero-Shot Summarization
Alexandra Chronopoulou
Jonas Pfeiffer
Joshua Maynez
Xinyi Wang
Sebastian Ruder
Priyanka Agrawal
MoMe
24
14
0
15 Nov 2023
Balance, Imbalance, and Rebalance: Understanding Robust Overfitting from a Minimax Game Perspective
Yifei Wang
Liangchen Li
Jiansheng Yang
Zhouchen Lin
Yisen Wang
23
11
0
30 Oct 2023
Model Merging by Uncertainty-Based Gradient Matching
Nico Daheim
Thomas Möllenhoff
E. Ponti
Iryna Gurevych
Mohammad Emtiyaz Khan
MoMe
FedML
32
43
0
19 Oct 2023
Causal Dynamic Variational Autoencoder for Counterfactual Regression in Longitudinal Data
Mouad El Bouchattaoui
Myriam Tami
Benoit Lepetit
P. Cournède
CML
OOD
58
2
0
16 Oct 2023
On the Over-Memorization During Natural, Robust and Catastrophic Overfitting
Runqi Lin
Chaojian Yu
Bo Han
Tongliang Liu
17
7
0
13 Oct 2023
Weight Averaging Improves Knowledge Distillation under Domain Shift
Valeriy Berezovskiy
Nikita Morozov
MoMe
19
1
0
20 Sep 2023
Uncertainty Estimation of Transformers' Predictions via Topological Analysis of the Attention Matrices
Elizaveta Kostenok
D. Cherniavskii
Alexey Zaytsev
41
5
0
22 Aug 2023
Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers
N. Frumkin
Dibakar Gope
Diana Marculescu
MQ
33
16
0
21 Aug 2023
Benchmarking Scalable Epistemic Uncertainty Quantification in Organ Segmentation
Jadie Adams
Shireen Elhabian
UQCV
15
5
0
15 Aug 2023
Lookbehind-SAM: k steps back, 1 step forward
Gonçalo Mordido
Pranshu Malviya
A. Baratin
Sarath Chandar
AAML
40
1
0
31 Jul 2023
Cross-dimensional transfer learning in medical image segmentation with deep learning
Hicham Messaoudi
Ahror Belaid
Douraied Ben Salem
Pierre-Henri Conze
MedIm
22
23
0
29 Jul 2023
FedSoup: Improving Generalization and Personalization in Federated Learning via Selective Model Interpolation
Minghui Chen
Meirui Jiang
Qianming Dou
Zehua Wang
Xiaoxiao Li
FedML
30
15
0
20 Jul 2023
Layer-wise Linear Mode Connectivity
Linara Adilova
Maksym Andriushchenko
Michael Kamp
Asja Fischer
Martin Jaggi
FedML
FAtt
MoMe
26
15
0
13 Jul 2023
Concurrent ischemic lesion age estimation and segmentation of CT brain using a Transformer-based network
A. Marcus
P. Bentley
Daniel Rueckert
MedIm
8
9
0
21 Jun 2023
Confidence-Based Model Selection: When to Take Shortcuts for Subpopulation Shifts
Annie S. Chen
Yoonho Lee
Amrith Rajagopal Setlur
Sergey Levine
Chelsea Finn
OOD
14
5
0
19 Jun 2023
A Boosted Model Ensembling Approach to Ball Action Spotting in Videos: The Runner-Up Solution to CVPR'23 SoccerNet Challenge
Luping Wang
Hao Guo
B. Liu
16
3
0
09 Jun 2023
Quantifying Representation Reliability in Self-Supervised Learning Models
Young-Jin Park
Hao Wang
Shervin Ardeshir
Navid Azizan
SSL
UQCV
21
3
0
31 May 2023
VIPriors 3: Visual Inductive Priors for Data-Efficient Deep Learning Challenges
Robert-Jan Bruintjes
A. Lengyel
Marcos Baptista-Rios
O. Kayhan
Davide Zambrano
Nergis Tomen
J. C. V. Gemert
18
9
0
31 May 2023
HyperTime: Hyperparameter Optimization for Combating Temporal Distribution Shifts
Shaokun Zhang
Yiran Wu
Zhonghua Zheng
Qingyun Wu
Chi Wang
OOD
43
7
0
28 May 2023
How to escape sharp minima with random perturbations
Kwangjun Ahn
Ali Jadbabaie
S. Sra
ODL
24
6
0
25 May 2023
POEM: Polarization of Embeddings for Domain-Invariant Representations
Sang-Yeong Jo
Sung Whan Yoon
19
8
0
22 May 2023
Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models
Guillermo Ortiz-Jiménez
Alessandro Favero
P. Frossard
MoMe
37
103
0
22 May 2023
Measuring and Mitigating Local Instability in Deep Neural Networks
Arghya Datta
Subhrangshu Nandi
Jingcheng Xu
Greg Ver Steeg
He Xie
Anoop Kumar
Aram Galstyan
15
3
0
18 May 2023
SRIL: Selective Regularization for Class-Incremental Learning
Jisu Han
Jaemin Na
Wonjun Hwang
CLL
10
0
0
09 May 2023
An Adaptive Policy to Employ Sharpness-Aware Minimization
Weisen Jiang
Hansi Yang
Yu Zhang
James T. Kwok
AAML
79
31
0
28 Apr 2023
Advancing Ischemic Stroke Diagnosis: A Novel Two-Stage Approach for Blood Clot Origin Identification
Koushik Sivarama Krishnan
P. J. J. Nikesh
Swathi Gnanasekar
Karthik Sivarama Krishnan
19
0
0
26 Apr 2023
Do deep neural networks have an inbuilt Occam's razor?
Chris Mingard
Henry Rees
Guillermo Valle Pérez
A. Louis
UQCV
BDL
19
15
0
13 Apr 2023
Uncertainty-Aware Natural Language Inference with Stochastic Weight Averaging
Aarne Talman
H. Çelikkanat
Sami Virpioja
Markus Heinonen
Jörg Tiedemann
BDL
UQCV
24
7
0
10 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
28
40
0
07 Apr 2023
Randomized Adversarial Style Perturbations for Domain Generalization
Taehoon Kim
Bohyung Han
AAML
30
2
0
04 Apr 2023
Generalization Matters: Loss Minima Flattening via Parameter Hybridization for Efficient Online Knowledge Distillation
Tianli Zhang
Mengqi Xue
Jiangtao Zhang
Haofei Zhang
Yu Wang
Lechao Cheng
Jie Song
Mingli Song
28
5
0
26 Mar 2023
Generalist: Decoupling Natural and Robust Generalization
Hongjun Wang
Yisen Wang
OOD
AAML
31
14
0
24 Mar 2023