Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
arXiv: 2203.05482

10 March 2022
Mitchell Wortsman
Gabriel Ilharco
S. Gadre
Rebecca Roelofs
Raphael Gontijo-Lopes
Ari S. Morcos
Hongseok Namkoong
Ali Farhadi
Y. Carmon
Simon Kornblith
Ludwig Schmidt
    MoMe

Papers citing "Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time"

50 / 667 papers shown
Modular Deep Learning
Jonas Pfeiffer
Sebastian Ruder
Ivan Vulić
E. Ponti
MoMe
OOD
19
73
0
22 Feb 2023
Deep Active Learning in the Presence of Label Noise: A Survey
Moseli Motsóehli
Kyungim Baek
NoLa
VLM
21
5
0
22 Feb 2023
Soft Error Reliability Analysis of Vision Transformers
Xing-xiong Xue
Cheng Liu
Ying Wang
Bing Yang
Tao Luo
L. Zhang
Huawei Li
Xiaowei Li
30
14
0
21 Feb 2023
Seasoning Model Soups for Robustness to Adversarial and Natural Distribution Shifts
Francesco Croce
Sylvestre-Alvise Rebuffi
Evan Shelhamer
Sven Gowal
AAML
26
17
0
20 Feb 2023
Calibrating the Rigged Lottery: Making All Tickets Reliable
Bowen Lei
Ruqi Zhang
Dongkuan Xu
Bani Mallick
UQCV
8
7
0
18 Feb 2023
PrefixMol: Target- and Chemistry-aware Molecule Design via Prefix Embedding
Zhangyang Gao
Yuqi Hu
Cheng Tan
Stan Z. Li
18
13
0
14 Feb 2023
AdapterSoup: Weight Averaging to Improve Generalization of Pretrained Language Models
Alexandra Chronopoulou
Matthew E. Peters
Alexander M. Fraser
Jesse Dodge
MoMe
8
65
0
14 Feb 2023
A Modern Look at the Relationship between Sharpness and Generalization
Maksym Andriushchenko
Francesco Croce
Maximilian Müller
Matthias Hein
Nicolas Flammarion
3DH
11
52
0
14 Feb 2023
Calibrating a Deep Neural Network with Its Predecessors
Linwei Tao
Minjing Dong
Daochang Liu
Changming Sun
Chang Xu
BDL
UQCV
6
5
0
13 Feb 2023
Sparse Mutation Decompositions: Fine Tuning Deep Neural Networks with Subspace Evolution
Tim Whitaker
L. D. Whitley
14
0
0
12 Feb 2023
Graph Neural Network-Inspired Kernels for Gaussian Processes in Semi-Supervised Learning
Zehao Niu
M. Anitescu
Jing Chen
BDL
11
4
0
12 Feb 2023
Knowledge is a Region in Weight Space for Fine-tuned Language Models
Almog Gueta
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
16
49
0
09 Feb 2023
DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule
Maor Ivgi
Oliver Hinder
Y. Carmon
ODL
11
56
0
08 Feb 2023
Exploring the Benefits of Training Expert Language Models over Instruction Tuning
Joel Jang
Seungone Kim
Seonghyeon Ye
Doyoung Kim
Lajanugen Logeswaran
Moontae Lee
Kyungjae Lee
Minjoon Seo
LRM
ALM
19
79
0
07 Feb 2023
CHiLS: Zero-Shot Image Classification with Hierarchical Label Sets
Zachary Novack
Julian McAuley
Zachary Chase Lipton
Saurabh Garg
VLM
19
78
0
06 Feb 2023
Effective Robustness against Natural Distribution Shifts for Models with Different Training Data
Zhouxing Shi
Nicholas Carlini
Ananth Balashankar
Ludwig Schmidt
Cho-Jui Hsieh
Alex Beutel
Yao Qin
OOD
21
9
0
02 Feb 2023
A Comprehensive Survey of Continual Learning: Theory, Method and Application
Liyuan Wang
Xingxing Zhang
Hang Su
Jun Zhu
KELM
CLL
19
585
0
31 Jan 2023
Domain-Generalizable Multiple-Domain Clustering
Amit Rozner
Barak Battash
Lior Wolf
Ofir Lindenbaum
SSL
OOD
27
7
0
31 Jan 2023
Projected Subnetworks Scale Adaptation
Siddhartha Datta
N. Shadbolt
VLM
CLL
13
0
0
27 Jan 2023
Joint Training of Deep Ensembles Fails Due to Learner Collusion
Alan Jeffares
Tennison Liu
Jonathan Crabbé
M. Schaar
FedML
29
15
0
26 Jan 2023
Backward Compatibility During Data Updates by Weight Interpolation
Raphael Schumann
Elman Mansimov
Yi-An Lai
Nikolaos Pappas
Xibin Gao
Yi Zhang
11
4
0
25 Jan 2023
Model soups to increase inference without increasing compute time
Charles Dansereau
Milo Sobral
Maninder Bhogal
Mehdi Zalai
8
2
0
24 Jan 2023
Why Capsule Neural Networks Do Not Scale: Challenging the Dynamic Parse-Tree Assumption
Matthias Mitterreiter
Marcel Koch
Joachim Giesen
Soren Laue
3DPC
MedIm
18
13
0
04 Jan 2023
Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization
Alexandre Ramé
Kartik Ahuja
Jianyu Zhang
Matthieu Cord
Léon Bottou
David Lopez-Paz
MoMe
OODD
21
80
0
20 Dec 2022
Dataless Knowledge Fusion by Merging Weights of Language Models
Xisen Jin
Xiang Ren
Daniel Preotiuc-Pietro
Pengxiang Cheng
FedML
MoMe
13
210
0
19 Dec 2022
PoE: a Panel of Experts for Generalized Automatic Dialogue Assessment
Chen Zhang
L. F. D’Haro
Qiquan Zhang
Thomas Friedrichs
Haizhou Li
13
7
0
18 Dec 2022
Learning useful representations for shifting tasks and distributions
Jianyu Zhang
Léon Bottou
OOD
15
13
0
14 Dec 2022
Improving Generalization of Pre-trained Language Models via Stochastic Weight Averaging
Peng Lu
I. Kobyzev
Mehdi Rezagholizadeh
Ahmad Rashid
A. Ghodsi
Philippe Langlais
MoMe
28
11
0
12 Dec 2022
Accelerating Dataset Distillation via Model Augmentation
Lei Zhang
Jie M. Zhang
Bowen Lei
Subhabrata Mukherjee
Xiang Pan
Bo-Lu Zhao
Caiwen Ding
Y. Li
Dongkuan Xu
DD
10
62
0
12 Dec 2022
Local Neighborhood Features for 3D Classification
Shivanand Venkanna Sheshappanavar
Chandra Kambhamettu
3DPC
12
2
0
09 Dec 2022
A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others
Zhiheng Li
Ivan Evtimov
Albert Gordo
C. Hazirbas
Tal Hassner
Cristian Canton Ferrer
Chenliang Xu
Mark Ibrahim
18
68
0
09 Dec 2022
Open Vocabulary Semantic Segmentation with Patch Aligned Contrastive Learning
Jishnu Mukhoti
Tsung-Yu Lin
Omid Poursaeed
Rui Wang
Ashish Shah
Philip H. S. Torr
Ser-Nam Lim
VLM
24
79
0
09 Dec 2022
Co-training $2^L$ Submodels for Visual Recognition
Hugo Touvron
Matthieu Cord
Maxime Oquab
Piotr Bojanowski
Jakob Verbeek
Hervé Jégou
VLM
17
9
0
09 Dec 2022
Editing Models with Task Arithmetic
Gabriel Ilharco
Marco Tulio Ribeiro
Mitchell Wortsman
Suchin Gururangan
Ludwig Schmidt
Hannaneh Hajishirzi
Ali Farhadi
KELM
MoMe
MU
31
421
0
08 Dec 2022
MixBoost: Improving the Robustness of Deep Neural Networks by Boosting Data Augmentation
Zhendong Liu
Wenyu Jiang
Min Guo
Chongjun Wang
AAML
8
1
0
08 Dec 2022
ColD Fusion: Collaborative Descent for Distributed Multitask Finetuning
Shachar Don-Yehiya
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
MoMe
16
52
0
02 Dec 2022
AGRO: Adversarial Discovery of Error-prone groups for Robust Optimization
Bhargavi Paranjape
Pradeep Dasigi
Vivek Srikumar
Luke Zettlemoyer
Hannaneh Hajishirzi
17
7
0
02 Dec 2022
Finetune like you pretrain: Improved finetuning of zero-shot vision models
Sachin Goyal
Ananya Kumar
Sankalp Garg
Zico Kolter
Aditi Raghunathan
CLIP
VLM
18
136
0
01 Dec 2022
Context-Aware Robust Fine-Tuning
Xiaofeng Mao
YueFeng Chen
Xiaojun Jia
Rong Zhang
Hui Xue
Zhao Li
VLM
CLIP
14
23
0
29 Nov 2022
Neural Architecture for Online Ensemble Continual Learning
Mateusz Wójcik
Witold Kościukiewicz
Tomasz Kajdanowicz
Adam Gonczarek
CLL
10
1
0
27 Nov 2022
1st Place Solution to NeurIPS 2022 Challenge on Visual Domain Adaptation
Daehan Kim
Min-seok Seo
Youngjin Jeon
Dong-Geol Choi
OOD
20
1
0
26 Nov 2022
Neural Dependencies Emerging from Learning Massive Categories
Ruili Feng
Kecheng Zheng
Kai Zhu
Yujun Shen
Jian Zhao
Yukun Huang
Deli Zhao
Jingren Zhou
Michael I. Jordan
Zhengjun Zha
UQCV
14
0
0
21 Nov 2022
Weighted Ensemble Self-Supervised Learning
Yangjun Ruan
Saurabh Singh
Warren Morningstar
Alexander A. Alemi
Sergey Ioffe
Ian S. Fischer
Joshua V. Dillon
FedML
14
15
0
18 Nov 2022
How to Fine-Tune Vision Models with SGD
Ananya Kumar
Ruoqi Shen
Sébastien Bubeck
Suriya Gunasekar
VLM
6
28
0
17 Nov 2022
Mechanistic Mode Connectivity
Ekdeep Singh Lubana
Eric J. Bigelow
Robert P. Dick
David M. Krueger
Hidenori Tanaka
19
45
0
15 Nov 2022
REPAIR: REnormalizing Permuted Activations for Interpolation Repair
Keller Jordan
Hanie Sedghi
O. Saukh
R. Entezari
Behnam Neyshabur
MoMe
30
94
0
15 Nov 2022
FedTune: A Deep Dive into Efficient Federated Fine-Tuning with Pre-trained Transformers
Jinyu Chen
Wenchao Xu
Song Guo
Junxiao Wang
Jie M. Zhang
Haozhao Wang
FedML
15
32
0
15 Nov 2022
Towards Better Out-of-Distribution Generalization of Neural Algorithmic Reasoning Tasks
Sadegh Mahdavi
Kevin Swersky
Thomas Kipf
Milad Hashemi
Christos Thrampoulidis
Renjie Liao
LRM
OOD
NAI
40
26
0
01 Nov 2022
Reduce, Reuse, Recycle: Improving Training Efficiency with Distillation
Cody Blakeney
Jessica Zosa Forde
Jonathan Frankle
Ziliang Zong
Matthew L. Leavitt
VLM
17
4
0
01 Nov 2022
AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning
Yaqing Wang
Sahaj Agarwal
Subhabrata Mukherjee
Xiaodong Liu
Jing Gao
Ahmed Hassan Awadallah
Jianfeng Gao
MoE
9
116
0
31 Oct 2022