arXiv: 2203.05482
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
10 March 2022
Mitchell Wortsman
Gabriel Ilharco
S. Gadre
Rebecca Roelofs
Raphael Gontijo-Lopes
Ari S. Morcos
Hongseok Namkoong
Ali Farhadi
Y. Carmon
Simon Kornblith
Ludwig Schmidt
MoMe
Papers citing
"Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time"
50 / 667 papers shown
Modular Deep Learning
Jonas Pfeiffer
Sebastian Ruder
Ivan Vulić
E. Ponti
MoMe
OOD
19
73
0
22 Feb 2023
Deep Active Learning in the Presence of Label Noise: A Survey
Moseli Mots'oehli
Kyungim Baek
NoLa
VLM
21
5
0
22 Feb 2023
Soft Error Reliability Analysis of Vision Transformers
Xing-xiong Xue
Cheng Liu
Ying Wang
Bing Yang
Tao Luo
L. Zhang
Huawei Li
Xiaowei Li
30
14
0
21 Feb 2023
Seasoning Model Soups for Robustness to Adversarial and Natural Distribution Shifts
Francesco Croce
Sylvestre-Alvise Rebuffi
Evan Shelhamer
Sven Gowal
AAML
26
17
0
20 Feb 2023
Calibrating the Rigged Lottery: Making All Tickets Reliable
Bowen Lei
Ruqi Zhang
Dongkuan Xu
Bani Mallick
UQCV
8
7
0
18 Feb 2023
PrefixMol: Target- and Chemistry-aware Molecule Design via Prefix Embedding
Zhangyang Gao
Yuqi Hu
Cheng Tan
Stan Z. Li
18
13
0
14 Feb 2023
AdapterSoup: Weight Averaging to Improve Generalization of Pretrained Language Models
Alexandra Chronopoulou
Matthew E. Peters
Alexander M. Fraser
Jesse Dodge
MoMe
8
65
0
14 Feb 2023
A Modern Look at the Relationship between Sharpness and Generalization
Maksym Andriushchenko
Francesco Croce
Maximilian Müller
Matthias Hein
Nicolas Flammarion
3DH
11
52
0
14 Feb 2023
Calibrating a Deep Neural Network with Its Predecessors
Linwei Tao
Minjing Dong
Daochang Liu
Changming Sun
Chang Xu
BDL
UQCV
6
5
0
13 Feb 2023
Sparse Mutation Decompositions: Fine Tuning Deep Neural Networks with Subspace Evolution
Tim Whitaker
L. D. Whitley
14
0
0
12 Feb 2023
Graph Neural Network-Inspired Kernels for Gaussian Processes in Semi-Supervised Learning
Zehao Niu
M. Anitescu
Jing Chen
BDL
11
4
0
12 Feb 2023
Knowledge is a Region in Weight Space for Fine-tuned Language Models
Almog Gueta
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
16
49
0
09 Feb 2023
DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule
Maor Ivgi
Oliver Hinder
Y. Carmon
ODL
11
56
0
08 Feb 2023
Exploring the Benefits of Training Expert Language Models over Instruction Tuning
Joel Jang
Seungone Kim
Seonghyeon Ye
Doyoung Kim
Lajanugen Logeswaran
Moontae Lee
Kyungjae Lee
Minjoon Seo
LRM
ALM
19
79
0
07 Feb 2023
CHiLS: Zero-Shot Image Classification with Hierarchical Label Sets
Zachary Novack
Julian McAuley
Zachary Chase Lipton
Saurabh Garg
VLM
19
78
0
06 Feb 2023
Effective Robustness against Natural Distribution Shifts for Models with Different Training Data
Zhouxing Shi
Nicholas Carlini
Ananth Balashankar
Ludwig Schmidt
Cho-Jui Hsieh
Alex Beutel
Yao Qin
OOD
21
9
0
02 Feb 2023
A Comprehensive Survey of Continual Learning: Theory, Method and Application
Liyuan Wang
Xingxing Zhang
Hang Su
Jun Zhu
KELM
CLL
19
585
0
31 Jan 2023
Domain-Generalizable Multiple-Domain Clustering
Amit Rozner
Barak Battash
Lior Wolf
Ofir Lindenbaum
SSL
OOD
27
7
0
31 Jan 2023
Projected Subnetworks Scale Adaptation
Siddhartha Datta
N. Shadbolt
VLM
CLL
13
0
0
27 Jan 2023
Joint Training of Deep Ensembles Fails Due to Learner Collusion
Alan Jeffares
Tennison Liu
Jonathan Crabbé
M. Schaar
FedML
29
15
0
26 Jan 2023
Backward Compatibility During Data Updates by Weight Interpolation
Raphael Schumann
Elman Mansimov
Yi-An Lai
Nikolaos Pappas
Xibin Gao
Yi Zhang
11
4
0
25 Jan 2023
Model soups to increase inference without increasing compute time
Charles Dansereau
Milo Sobral
Maninder Bhogal
Mehdi Zalai
8
2
0
24 Jan 2023
Why Capsule Neural Networks Do Not Scale: Challenging the Dynamic Parse-Tree Assumption
Matthias Mitterreiter
Marcel Koch
Joachim Giesen
Sören Laue
3DPC
MedIm
18
13
0
04 Jan 2023
Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization
Alexandre Ramé
Kartik Ahuja
Jianyu Zhang
Matthieu Cord
Léon Bottou
David Lopez-Paz
MoMe
OODD
21
80
0
20 Dec 2022
Dataless Knowledge Fusion by Merging Weights of Language Models
Xisen Jin
Xiang Ren
Daniel Preotiuc-Pietro
Pengxiang Cheng
FedML
MoMe
13
210
0
19 Dec 2022
PoE: a Panel of Experts for Generalized Automatic Dialogue Assessment
Chen Zhang
L. F. D’Haro
Qiquan Zhang
Thomas Friedrichs
Haizhou Li
13
7
0
18 Dec 2022
Learning useful representations for shifting tasks and distributions
Jianyu Zhang
Léon Bottou
OOD
15
13
0
14 Dec 2022
Improving Generalization of Pre-trained Language Models via Stochastic Weight Averaging
Peng Lu
I. Kobyzev
Mehdi Rezagholizadeh
Ahmad Rashid
A. Ghodsi
Philippe Langlais
MoMe
28
11
0
12 Dec 2022
Accelerating Dataset Distillation via Model Augmentation
Lei Zhang
Jie M. Zhang
Bowen Lei
Subhabrata Mukherjee
Xiang Pan
Bo-Lu Zhao
Caiwen Ding
Y. Li
Dongkuan Xu
DD
10
62
0
12 Dec 2022
Local Neighborhood Features for 3D Classification
Shivanand Venkanna Sheshappanavar
Chandra Kambhamettu
3DPC
12
2
0
09 Dec 2022
A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others
Zhiheng Li
Ivan Evtimov
Albert Gordo
C. Hazirbas
Tal Hassner
Cristian Canton Ferrer
Chenliang Xu
Mark Ibrahim
18
68
0
09 Dec 2022
Open Vocabulary Semantic Segmentation with Patch Aligned Contrastive Learning
Jishnu Mukhoti
Tsung-Yu Lin
Omid Poursaeed
Rui Wang
Ashish Shah
Philip H. S. Torr
Ser-Nam Lim
VLM
24
79
0
09 Dec 2022
Co-training 2^L Submodels for Visual Recognition
Hugo Touvron
Matthieu Cord
Maxime Oquab
Piotr Bojanowski
Jakob Verbeek
Hervé Jégou
VLM
17
9
0
09 Dec 2022
Editing Models with Task Arithmetic
Gabriel Ilharco
Marco Tulio Ribeiro
Mitchell Wortsman
Suchin Gururangan
Ludwig Schmidt
Hannaneh Hajishirzi
Ali Farhadi
KELM
MoMe
MU
31
421
0
08 Dec 2022
MixBoost: Improving the Robustness of Deep Neural Networks by Boosting Data Augmentation
Zhendong Liu
Wenyu Jiang
Min Guo
Chongjun Wang
AAML
8
1
0
08 Dec 2022
ColD Fusion: Collaborative Descent for Distributed Multitask Finetuning
Shachar Don-Yehiya
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
MoMe
16
52
0
02 Dec 2022
AGRO: Adversarial Discovery of Error-prone groups for Robust Optimization
Bhargavi Paranjape
Pradeep Dasigi
Vivek Srikumar
Luke Zettlemoyer
Hannaneh Hajishirzi
17
7
0
02 Dec 2022
Finetune like you pretrain: Improved finetuning of zero-shot vision models
Sachin Goyal
Ananya Kumar
Sankalp Garg
Zico Kolter
Aditi Raghunathan
CLIP
VLM
18
136
0
01 Dec 2022
Context-Aware Robust Fine-Tuning
Xiaofeng Mao
YueFeng Chen
Xiaojun Jia
Rong Zhang
Hui Xue
Zhao Li
VLM
CLIP
14
23
0
29 Nov 2022
Neural Architecture for Online Ensemble Continual Learning
Mateusz Wójcik
Witold Kościukiewicz
Tomasz Kajdanowicz
Adam Gonczarek
CLL
10
1
0
27 Nov 2022
1st Place Solution to NeurIPS 2022 Challenge on Visual Domain Adaptation
Daehan Kim
Min-seok Seo
Youngjin Jeon
Dong-Geol Choi
OOD
20
1
0
26 Nov 2022
Neural Dependencies Emerging from Learning Massive Categories
Ruili Feng
Kecheng Zheng
Kai Zhu
Yujun Shen
Jian Zhao
Yukun Huang
Deli Zhao
Jingren Zhou
Michael I. Jordan
Zhengjun Zha
UQCV
14
0
0
21 Nov 2022
Weighted Ensemble Self-Supervised Learning
Yangjun Ruan
Saurabh Singh
Warren Morningstar
Alexander A. Alemi
Sergey Ioffe
Ian S. Fischer
Joshua V. Dillon
FedML
14
15
0
18 Nov 2022
How to Fine-Tune Vision Models with SGD
Ananya Kumar
Ruoqi Shen
Sébastien Bubeck
Suriya Gunasekar
VLM
6
28
0
17 Nov 2022
Mechanistic Mode Connectivity
Ekdeep Singh Lubana
Eric J. Bigelow
Robert P. Dick
David M. Krueger
Hidenori Tanaka
19
45
0
15 Nov 2022
REPAIR: REnormalizing Permuted Activations for Interpolation Repair
Keller Jordan
Hanie Sedghi
O. Saukh
R. Entezari
Behnam Neyshabur
MoMe
30
94
0
15 Nov 2022
FedTune: A Deep Dive into Efficient Federated Fine-Tuning with Pre-trained Transformers
Jinyu Chen
Wenchao Xu
Song Guo
Junxiao Wang
Jie M. Zhang
Haozhao Wang
FedML
15
32
0
15 Nov 2022
Towards Better Out-of-Distribution Generalization of Neural Algorithmic Reasoning Tasks
Sadegh Mahdavi
Kevin Swersky
Thomas Kipf
Milad Hashemi
Christos Thrampoulidis
Renjie Liao
LRM
OOD
NAI
40
26
0
01 Nov 2022
Reduce, Reuse, Recycle: Improving Training Efficiency with Distillation
Cody Blakeney
Jessica Zosa Forde
Jonathan Frankle
Ziliang Zong
Matthew L. Leavitt
VLM
17
4
0
01 Nov 2022
AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning
Yaqing Wang
Sahaj Agarwal
Subhabrata Mukherjee
Xiaodong Liu
Jing Gao
Ahmed Hassan Awadallah
Jianfeng Gao
MoE
9
116
0
31 Oct 2022