2110.07431
Cited By
Towards More Effective and Economic Sparsely-Activated Model
14 October 2021
Hao Jiang
Ke Zhan
Jianwei Qu
Yongkang Wu
Zhaoye Fei
Xinyu Zhang
Lei Chen
Zhicheng Dou
Xipeng Qiu
Zi-Han Guo
Ruofei Lai
Jiawen Wu
Enrui Hu
Yinxia Zhang
Yantao Jia
Fan Yu
Zhao Cao
MoE
Papers citing "Towards More Effective and Economic Sparsely-Activated Model"
3 / 3 papers shown
Title
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
226
4,424
0
23 Jan 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
243
1,791
0
17 Sep 2019
Language Models as Knowledge Bases?
Fabio Petroni
Tim Rocktäschel
Patrick Lewis
A. Bakhtin
Yuxiang Wu
Alexander H. Miller
Sebastian Riedel
KELM
AI4MH
404
2,576
0
03 Sep 2019