1701.06538
Cited By
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
23 January 2017
Noam M. Shazeer
Azalia Mirhoseini
Krzysztof Maziarz
Andy Davis
Quoc V. Le
Geoffrey E. Hinton
J. Dean
MoE
Papers citing
"Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer"
45 / 495 papers shown
InstaNAS: Instance-aware Neural Architecture Search
A. Cheng
Chieh Hubert Lin
Da-Cheng Juan
Wei Wei
Min Sun
22
46
0
26 Nov 2018
SpotTune: Transfer Learning through Adaptive Fine-tuning
Yunhui Guo
Humphrey Shi
Abhishek Kumar
Kristen Grauman
Tajana Simunic
Rogerio Feris
36
445
0
21 Nov 2018
Federated Learning for Mobile Keyboard Prediction
Andrew Straiton Hard
Kanishka Rao
Zhifeng Lin
Swaroop Indra Ramaswamy
Youjie Li
S. Augenstein
A. Schwing
M. Annavaram
A. Avestimehr
FedML
9
1,510
0
08 Nov 2018
Learning to Learn without Forgetting by Maximizing Transfer and Minimizing Interference
Matthew D Riemer
Ignacio Cases
R. Ajemian
Miao Liu
Irina Rish
Y. Tu
Gerald Tesauro
CLL
27
765
0
29 Oct 2018
Mode Normalization
Lucas Deecke
Iain Murray
Hakan Bilen
OOD
29
33
0
12 Oct 2018
Dynamic Channel Pruning: Feature Boosting and Suppression
Xitong Gao
Yiren Zhao
L. Dudziak
Robert D. Mullins
Chengzhong Xu
30
311
0
12 Oct 2018
Multi-Source Cross-Lingual Model Transfer: Learning What to Share
Xilun Chen
Ahmed Hassan Awadallah
Hany Hassan
Wei Wang
Claire Cardie
36
20
0
08 Oct 2018
A Span Selection Model for Semantic Role Labeling
Hiroki Ouchi
Hiroyuki Shindo
Yuji Matsumoto
24
96
0
04 Oct 2018
Adaptive Input Representations for Neural Language Modeling
Alexei Baevski
Michael Auli
21
387
0
28 Sep 2018
Recurrent World Models Facilitate Policy Evolution
David R Ha
Jürgen Schmidhuber
SyDa
TPM
22
914
0
04 Sep 2018
Improving the Expressiveness of Deep Learning Frameworks with Recursion
Eunji Jeong
Joo Seong Jeong
Soojeong Kim
Gyeong-In Yu
Byung-Gon Chun
23
18
0
04 Sep 2018
Direct Output Connection for a High-Rank Language Model
Sho Takase
Jun Suzuki
Masaaki Nagata
18
36
0
30 Aug 2018
Supporting Very Large Models using Automatic Dataflow Graph Partitioning
Minjie Wang
Chien-chin Huang
Jinyang Li
35
154
0
24 Jul 2018
A Modulation Module for Multi-task Learning with Applications in Image Retrieval
Xiangyu Zhao
Haoxiang Li
Xiaohui Shen
Xiaodan Liang
Ying Nian Wu
21
137
0
17 Jul 2018
Multi-variable LSTM neural network for autoregressive exogenous model
Tian Guo
Tao R. Lin
BDL
AI4TS
30
19
0
17 Jun 2018
Channel Gating Neural Networks
Weizhe Hua
Yuan Zhou
Christopher De Sa
Zhiru Zhang
G. E. Suh
15
180
0
29 May 2018
Dynamic Control Flow in Large-Scale Machine Learning
Yuan Yu
Martín Abadi
P. Barham
E. Brevdo
M. Burrows
...
Michael Isard
M. Kudlur
R. Monga
D. Murray
Xiaoqiang Zheng
AI4CE
22
106
0
04 May 2018
Network Transplanting
Quanshi Zhang
Yu Yang
Ying Nian Wu
Song-Chun Zhu
OOD
11
5
0
26 Apr 2018
VLocNet++: Deep Multitask Learning for Semantic Visual Localization and Odometry
Noha Radwan
Abhinav Valada
Wolfram Burgard
76
240
0
23 Apr 2018
Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling
Liyuan Liu
Xiang Ren
Jingbo Shang
Jian-wei Peng
Jiawei Han
19
44
0
20 Apr 2018
An Analysis of Neural Language Modeling at Multiple Scales
Stephen Merity
N. Keskar
R. Socher
22
170
0
22 Mar 2018
Tensor2Tensor for Neural Machine Translation
Ashish Vaswani
Samy Bengio
E. Brevdo
François Chollet
Aidan Gomez
...
Nal Kalchbrenner
Niki Parmar
Ryan Sepassi
Noam M. Shazeer
Jakob Uszkoreit
40
526
0
16 Mar 2018
Universal Neural Machine Translation for Extremely Low Resource Languages
Jiatao Gu
Hany Hassan
Jacob Devlin
V. Li
23
273
0
15 Feb 2018
Not to Cry Wolf: Distantly Supervised Multitask Learning in Critical Care
Patrick Schwab
E. Keller
C. Muroi
David J. Mack
C. Strässle
W. Karlen
21
23
0
14 Feb 2018
Intriguing Properties of Randomly Weighted Networks: Generalizing While Learning Next to Nothing
Amir Rosenfeld
John K. Tsotsos
MLT
29
51
0
02 Feb 2018
Rank of Experts: Detection Network Ensemble
Seung-Hwan Bae
Youngwan Lee
Y. Jo
Yuseok Bae
Joong-won Hwang
ObjD
26
5
0
01 Dec 2017
Convolutional Networks with Adaptive Inference Graphs
Andreas Veit
Serge J. Belongie
OOD
GNN
33
382
0
30 Nov 2017
A Correspondence Between Random Neural Networks and Statistical Field Theory
S. Schoenholz
Jeffrey Pennington
Jascha Narain Sohl-Dickstein
17
20
0
18 Oct 2017
FiLM: Visual Reasoning with a General Conditioning Layer
Ethan Perez
Florian Strub
H. D. Vries
Vincent Dumoulin
Aaron Courville
FAtt
AIMat
OffRL
AI4CE
86
2,151
0
22 Sep 2017
Neural Optimizer Search with Reinforcement Learning
Irwan Bello
Barret Zoph
Vijay Vasudevan
Quoc V. Le
ODL
29
383
0
21 Sep 2017
Skip RNN: Learning to Skip State Updates in Recurrent Neural Networks
Victor Campos
Brendan Jou
Xavier Giró-i-Nieto
Jordi Torres
Shih-Fu Chang
16
217
0
22 Aug 2017
Deep Mixture of Diverse Experts for Large-Scale Visual Recognition
Tianyi Zhao
Jun-chen Yu
Zhenzhong Kuang
Wei Zhang
Jianping Fan
MoE
29
13
0
24 Jun 2017
meProp: Sparsified Back Propagation for Accelerated Deep Learning with Reduced Overfitting
Xu Sun
Xuancheng Ren
Shuming Ma
Houfeng Wang
11
155
0
19 Jun 2017
One Model To Learn Them All
Lukasz Kaiser
Aidan Gomez
Noam M. Shazeer
Ashish Vaswani
Niki Parmar
Llion Jones
Jakob Uszkoreit
VLM
ViT
17
333
0
16 Jun 2017
Large-Scale YouTube-8M Video Understanding with Deep Neural Networks
Manuk Akopyan
Eshsou Khashba
17
7
0
14 Jun 2017
Depthwise Separable Convolutions for Neural Machine Translation
Lukasz Kaiser
Aidan Gomez
François Chollet
33
278
0
09 Jun 2017
Deep Convolutional Decision Jungle for Image Classification
Seungryul Baek
K. Kim
Tae-Kyun Kim
19
17
0
06 Jun 2017
On-the-fly Operation Batching in Dynamic Computation Graphs
Graham Neubig
Yoav Goldberg
Chris Dyer
34
60
0
22 May 2017
Convolutional Sequence to Sequence Learning
Jonas Gehring
Michael Auli
David Grangier
Denis Yarats
Yann N. Dauphin
AIMat
17
3,263
0
08 May 2017
Hard Mixtures of Experts for Large Scale Weakly Supervised Vision
Sam Gross
Marc'Aurelio Ranzato
Arthur Szlam
MoE
14
101
0
20 Apr 2017
Dynamic Deep Neural Networks: Optimizing Accuracy-Efficiency Trade-offs by Selective Execution
Lanlan Liu
Jia Deng
19
200
0
02 Jan 2017
Capacity and Trainability in Recurrent Neural Networks
Jasmine Collins
Jascha Narain Sohl-Dickstein
David Sussillo
26
203
0
29 Nov 2016
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,746
0
26 Sep 2016
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong
Hieu H. Pham
Christopher D. Manning
218
7,926
0
17 Aug 2015
Automatic differentiation in machine learning: a survey
A. G. Baydin
Barak A. Pearlmutter
Alexey Radul
J. Siskind
PINN
AI4CE
ODL
54
2,746
0
20 Feb 2015