2408.15901
Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts
28 August 2024
Nikolas Gritsch
Qizhen Zhang
Acyr F. Locatelli
Sara Hooker
A. Ustun
MoE
Papers citing "Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts"
3 / 3 papers shown
Title
Hyper-X: A Unified Hypernetwork for Multi-Task Multilingual Transfer
A. Ustun
Arianna Bisazza
G. Bouma
Gertjan van Noord
Sebastian Ruder
44
32
0
24 May 2022
Mixture-of-Experts with Expert Choice Routing
Yanqi Zhou
Tao Lei
Hanxiao Liu
Nan Du
Yanping Huang
Vincent Zhao
Andrew M. Dai
Zhifeng Chen
Quoc V. Le
James Laudon
MoE
145
323
0
18 Feb 2022
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
220
3,054
0
23 Jan 2020