ResearchTrend.AI
Scalable Transfer Learning with Expert Models

28 September 2020
J. Puigcerver
C. Riquelme
Basil Mustafa
Cédric Renggli
André Susano Pinto
Sylvain Gelly
Daniel Keysers
N. Houlsby
arXiv: 2009.13239

Papers citing "Scalable Transfer Learning with Expert Models"

50 / 52 papers shown
Title
MoMa: A Modular Deep Learning Framework for Material Property Prediction
Botian Wang
Y. Ouyang
Yaohui Li
Y. Wang
Haorui Cui
Jianbing Zhang
Xiaonan Wang
Wei-Ying Ma
Hao Zhou
44
0
0
21 Feb 2025
Don't flatten, tokenize! Unlocking the key to SoftMoE's efficacy in deep RL
Ghada Sokar
J. Obando-Ceron
Aaron C. Courville
Hugo Larochelle
Pablo Samuel Castro
MoE
114
2
0
02 Oct 2024
Duo-LLM: A Framework for Studying Adaptive Computation in Large Language Models
Keivan Alizadeh
Iman Mirzadeh
Hooman Shahrokhi
Dmitry Belenko
Frank Sun
Minsik Cho
Mohammad Hossein Sekhavat
Moin Nabi
Mehrdad Farajtabar
MoE
31
1
0
01 Oct 2024
Leveraging Estimated Transferability Over Human Intuition for Model Selection in Text Ranking
Jun Bai
Zhuofan Chen
Zhenzi Li
Hanhua Hong
Jianfei Zhang
Chen Li
Chenghua Lin
Wenge Rong
24
0
0
24 Sep 2024
Data Selection for Transfer Unlearning
N. Sepahvand
Vincent Dumoulin
Eleni Triantafillou
Gintare Karolina Dziugaite
MU
37
4
0
16 May 2024
Conditional computation in neural networks: principles and research trends
Simone Scardapane
Alessandro Baiocchi
Alessio Devoto
V. Marsocci
Pasquale Minervini
Jary Pomponi
34
1
0
12 Mar 2024
Mixtures of Experts Unlock Parameter Scaling for Deep RL
J. Obando-Ceron
Ghada Sokar
Timon Willi
Clare Lyle
Jesse Farebrother
Jakob N. Foerster
Gintare Karolina Dziugaite
Doina Precup
Pablo Samuel Castro
50
29
0
13 Feb 2024
Task-customized Masked AutoEncoder via Mixture of Cluster-conditional Experts
Zhili Liu
Kai Chen
Jianhua Han
Lanqing Hong
Hang Xu
Zhenguo Li
James T. Kwok
MoE
109
24
0
08 Feb 2024
On Parameter Estimation in Deviated Gaussian Mixture of Experts
Huy Nguyen
Khai Nguyen
Nhat Ho
44
0
0
07 Feb 2024
Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning
Ming Li
Yong Zhang
Shwai He
Zhitao Li
Hongyu Zhao
Jianzong Wang
Ning Cheng
Tianyi Zhou
27
64
0
01 Feb 2024
LocMoE: A Low-Overhead MoE for Large Language Model Training
Jing Li
Zhijie Sun
Xuan He
Li Zeng
Yi Lin
Entong Li
Binfan Zheng
Rongqian Zhao
Xin Chen
MoE
30
11
0
25 Jan 2024
How to Determine the Most Powerful Pre-trained Language Model without Brute Force Fine-tuning? An Empirical Survey
Jun Bai
Xiaofeng Zhang
Chen Li
Hanhua Hong
Xi Xu
Chenghua Lin
Wenge Rong
23
10
0
08 Dec 2023
Direct Neural Machine Translation with Task-level Mixture of Experts models
Isidora Chara Tourni
Subhajit Naskar
MoE
19
0
0
18 Oct 2023
FedJETs: Efficient Just-In-Time Personalization with Federated Mixture of Experts
Chen Dun
Mirian Hipolito Garcia
Guoqing Zheng
Ahmed Hassan Awadallah
Robert Sim
Anastasios Kyrillidis
Dimitrios Dimitriadis
FedML
MoE
24
6
0
14 Jun 2023
Brainformers: Trading Simplicity for Efficiency
Yan-Quan Zhou
Nan Du
Yanping Huang
Daiyi Peng
Chang Lan
...
Zhifeng Chen
Quoc V. Le
Claire Cui
James Laudon
J. Dean
MoE
8
23
0
29 May 2023
Towards Convergence Rates for Parameter Estimation in Gaussian-gated Mixture of Experts
Huy Nguyen
TrungTin Nguyen
Khai Nguyen
Nhat Ho
MoE
43
12
0
12 May 2023
Revisiting Single-gated Mixtures of Experts
Amelie Royer
I. Karmanov
Andrii Skliar
B. Bejnordi
Tijmen Blankevoort
MoE
MoMe
25
6
0
11 Apr 2023
Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning
Haoyu He
Jianfei Cai
Jing Zhang
Dacheng Tao
Bohan Zhuang
VPVLM
14
50
0
15 Mar 2023
Memory-efficient NLLB-200: Language-specific Expert Pruning of a Massively Multilingual Machine Translation Model
Yeskendir Koishekenov
Alexandre Berard
Vassilina Nikoulina
MoE
30
29
0
19 Dec 2022
CEIP: Combining Explicit and Implicit Priors for Reinforcement Learning with Demonstrations
Kai Yan
A. Schwing
Yu-xiong Wang
OffRL
26
2
0
18 Oct 2022
Content-Based Search for Deep Generative Models
Daohan Lu
Sheng-Yu Wang
Nupur Kumari
Rohan Agarwal
Mia Tang
David Bau
Jun-Yan Zhu
DiffM
SyDa
32
5
0
06 Oct 2022
Granularity-aware Adaptation for Image Retrieval over Multiple Tasks
Jon Almazán
ByungSoo Ko
Geonmo Gu
Diane Larlus
Yannis Kalantidis
ObjD
VLM
28
7
0
05 Oct 2022
A Review of Sparse Expert Models in Deep Learning
W. Fedus
J. Dean
Barret Zoph
MoE
13
144
0
04 Sep 2022
SphereFed: Hyperspherical Federated Learning
Xin Dong
S. Zhang
Ang Li
H. T. Kung
FedML
33
19
0
19 Jul 2022
Modularized Transfer Learning with Multiple Knowledge Graphs for Zero-shot Commonsense Reasoning
Yu Jin Kim
Beong-woo Kwak
Youngwook Kim
Reinald Kim Amplayo
Seung-won Hwang
Jinyoung Yeo
LRM
19
12
0
08 Jun 2022
Deep transfer learning for image classification: a survey
J. Plested
Tom Gedeon
OOD
22
36
0
20 May 2022
SHiFT: An Efficient, Flexible Search Engine for Transfer Learning
Cédric Renggli
Xiaozhe Yao
Luka Kolar
Luka Rimanic
Ana Klimovic
Ce Zhang
OOD
32
4
0
04 Apr 2022
Proper Reuse of Image Classification Features Improves Object Detection
C. N. Vasconcelos
Vighnesh Birodkar
Vincent Dumoulin
VLM
17
32
0
01 Apr 2022
Memory Efficient Continual Learning with Transformers
B. Ermiş
Giovanni Zappella
Martin Wistuba
Aditya Rawal
Cédric Archambeau
CLL
21
42
0
09 Mar 2022
Mixture-of-Experts with Expert Choice Routing
Yan-Quan Zhou
Tao Lei
Han-Chu Liu
Nan Du
Yanping Huang
Vincent Zhao
Andrew M. Dai
Zhifeng Chen
Quoc V. Le
James Laudon
MoE
151
327
0
18 Feb 2022
Head2Toe: Utilizing Intermediate Representations for Better Transfer Learning
Utku Evci
Vincent Dumoulin
Hugo Larochelle
Michael C. Mozer
23
83
0
10 Jan 2022
Ensembling Off-the-shelf Models for GAN Training
Nupur Kumari
Richard Y. Zhang
Eli Shechtman
Jun-Yan Zhu
19
86
0
16 Dec 2021
Scalable Diverse Model Selection for Accessible Transfer Learning
Daniel Bolya
Rohit Mittapalli
Judy Hoffman
OODD
27
41
0
12 Nov 2021
Exploring the Limits of Large Scale Pre-training
Samira Abnar
Mostafa Dehghani
Behnam Neyshabur
Hanie Sedghi
AI4CE
55
114
0
05 Oct 2021
Representation Consolidation for Training Expert Students
Zhizhong Li
Avinash Ravichandran
Charless C. Fowlkes
M. Polito
Rahul Bhotika
Stefano Soatto
16
6
0
16 Jul 2021
Learning a Universal Template for Few-shot Dataset Generalization
Eleni Triantafillou
Hugo Larochelle
R. Zemel
Vincent Dumoulin
27
92
0
14 May 2021
Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters
Yan Xu
Etsuko Ishii
Samuel Cahyawijaya
Zihan Liu
Genta Indra Winata
Andrea Madotto
Dan Su
Pascale Fung
RALM
25
44
0
13 May 2021
ImageNet-21K Pretraining for the Masses
T. Ridnik
Emanuel Ben-Baruch
Asaf Noy
Lihi Zelnik-Manor
SSeg
VLM
CLIP
173
686
0
22 Apr 2021
What to Pre-Train on? Efficient Intermediate Task Selection
Clifton A. Poth
Jonas Pfeiffer
Andreas Rucklé
Iryna Gurevych
8
94
0
16 Apr 2021
Factors of Influence for Transfer Learning across Diverse Appearance Domains and Task Types
Thomas Mensink
J. Uijlings
Alina Kuznetsova
Michael Gygli
V. Ferrari
VLM
26
80
0
24 Mar 2021
Self-Supervised Pretraining Improves Self-Supervised Pretraining
Colorado Reed
Xiangyu Yue
Aniruddha Nrusimha
Sayna Ebrahimi
Vivek Vijaykumar
...
Shanghang Zhang
Devin Guillory
Sean L. Metzger
Kurt Keutzer
Trevor Darrell
25
105
0
23 Mar 2021
Sequential Random Network for Fine-grained Image Classification
Chaorong Li
Malu Zhang
Wei Huang
Feng-qing Qin
Anping Zeng
Yuanyuan Huang
14
0
0
12 Mar 2021
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
W. Fedus
Barret Zoph
Noam M. Shazeer
MoE
11
2,070
0
11 Jan 2021
Ranking Neural Checkpoints
Yandong Li
Xuhui Jia
Ruoxin Sang
Yukun Zhu
Bradley Green
Liqiang Wang
Boqing Gong
FedML
ELM
UQCV
27
47
0
23 Nov 2020
Deep Ensembles for Low-Data Transfer Learning
Basil Mustafa
C. Riquelme
J. Puigcerver
André Susano Pinto
Daniel Keysers
N. Houlsby
FedML
OOD
20
22
0
14 Oct 2020
Which Model to Transfer? Finding the Needle in the Growing Haystack
Cédric Renggli
André Susano Pinto
Luka Rimanic
J. Puigcerver
C. Riquelme
Ce Zhang
Mario Lucic
21
23
0
13 Oct 2020
Non-asymptotic oracle inequalities for the Lasso in high-dimensional mixture of experts
TrungTin Nguyen
Hien Nguyen
Faicel Chamroukhi
Geoffrey J. McLachlan
21
1
0
22 Sep 2020
SelfAugment: Automatic Augmentation Policies for Self-Supervised Learning
Colorado Reed
Sean L. Metzger
A. Srinivas
Trevor Darrell
Kurt Keutzer
SSL
22
49
0
16 Sep 2020
Selecting Relevant Features from a Multi-domain Representation for Few-shot Classification
Nikita Dvornik
Cordelia Schmid
Julien Mairal
VLM
170
24
0
20 Mar 2020
Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts
Max Ryabinin
Anton I. Gusev
FedML
6
48
0
10 Feb 2020