1609.04309
Efficient softmax approximation for GPUs
14 September 2016
Edouard Grave
Armand Joulin
Moustapha Cissé
David Grangier
Hervé Jégou
Papers citing
"Efficient softmax approximation for GPUs"
50 / 151 papers shown
Title
Scaling Up Collaborative Filtering Data Sets through Randomized Fractal Expansions
Francois Belletti
K. Lakshmanan
Walid Krichene
Nicolas Mayoraz
Yi-Fan Chen
John R. Anderson
Taylor Robie
Tayo Oguntebi
Dan Shirron
Amit Bleiwess
37
5
0
08 Apr 2019
Modeling Vocabulary for Big Code Machine Learning
Hlib Babii
Andrea Janes
Romain Robbes
19
22
0
03 Apr 2019
fairseq: A Fast, Extensible Toolkit for Sequence Modeling
Myle Ott
Sergey Edunov
Alexei Baevski
Angela Fan
Sam Gross
Nathan Ng
David Grangier
Michael Auli
VLM
FaML
23
3,130
0
01 Apr 2019
Cloze-driven Pretraining of Self-attention Networks
Alexei Baevski
Sergey Edunov
Yinhan Liu
Luke Zettlemoyer
Michael Auli
10
198
0
19 Mar 2019
Maybe Deep Neural Networks are the Best Choice for Modeling Source Code
Rafael-Michael Karampatsis
Charles Sutton
32
54
0
13 Mar 2019
Efficient Contextual Representation Learning Without Softmax Layer
Liunian Harold Li
Patrick H. Chen
Cho-Jui Hsieh
Kai-Wei Chang
26
6
0
28 Feb 2019
Compressing Gradient Optimizers via Count-Sketches
Ryan Spring
Anastasios Kyrillidis
Vijai Mohan
Anshumali Shrivastava
14
35
0
01 Feb 2019
Doubly Sparse: Sparse Mixture of Sparse Experts for Efficient Softmax Inference
Shun Liao
Ting Chen
Tian Lin
Denny Zhou
Chong-Jun Wang
MoE
7
2
0
30 Jan 2019
Pay Less Attention with Lightweight and Dynamic Convolutions
Felix Wu
Angela Fan
Alexei Baevski
Yann N. Dauphin
Michael Auli
11
604
0
29 Jan 2019
Error-Correcting Neural Sequence Prediction
James O'Neill
Danushka Bollegala
23
1
0
21 Jan 2019
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
VLM
38
3,679
0
09 Jan 2019
Attention-based sequence-to-sequence model for speech recognition: development of state-of-the-art system on LibriSpeech and its application to non-native English
Yan Yin
R. Prieto
Bin Wang
Jianwei Zhou
Yiwei Gu
Yang Liu
Hui-Ching Lin
7
2
0
31 Oct 2018
Learning to Screen for Fast Softmax Inference on Large Vocabulary Neural Networks
Patrick H. Chen
Si Si
Sanjiv Kumar
Yang Li
Cho-Jui Hsieh
16
21
0
29 Oct 2018
A no-regret generalization of hierarchical softmax to extreme multi-label classification
Marek Wydmuch
Kalina Jasinska
Mikhail Kuznetsov
R. Busa-Fekete
Krzysztof Dembczyński
24
100
0
27 Oct 2018
Real-time Neural-based Input Method
Jiali Yao
Raphael Shu
Xinjian Li
K. Ohtsuki
Hideki Nakayama
11
4
0
19 Oct 2018
Trellis Networks for Sequence Modeling
Shaojie Bai
J. Zico Kolter
V. Koltun
25
145
0
15 Oct 2018
Adaptive Input Representations for Neural Language Modeling
Alexei Baevski
Michael Auli
26
388
0
28 Sep 2018
Adaptive Pruning of Neural Language Models for Mobile Devices
Raphael Tang
Jimmy J. Lin
21
6
0
27 Sep 2018
Fast and Simple Mixture of Softmaxes with BPE and Hybrid-LightRNN for Language Generation
X. Kong
Qizhe Xie
Zihang Dai
Eduard H. Hovy
24
2
0
25 Sep 2018
Hard Non-Monotonic Attention for Character-Level Transduction
Shijie Wu
Pamela Shapiro
Ryan Cotterell
8
42
0
29 Aug 2018
Improved training of neural trans-dimensional random field language models with dynamic noise-contrastive estimation
Bin Wang
Zhijian Ou
25
14
0
03 Jul 2018
Unsupervised and Efficient Vocabulary Expansion for Recurrent Neural Network Language Models in ASR
Yerbolat Khassanov
Chng Eng Siong
KELM
29
5
0
27 Jun 2018
Sigsoftmax: Reanalysis of the Softmax Bottleneck
Sekitoshi Kanai
Yasuhiro Fujiwara
Yuki Yamanaka
S. Adachi
19
68
0
28 May 2018
Learning to Write with Cooperative Discriminators
Ari Holtzman
Jan Buys
Maxwell Forbes
Antoine Bosselut
David Golub
Yejin Choi
31
234
0
16 May 2018
Adversarial Contrastive Estimation
A. Bose
Huan Ling
Yanshuai Cao
13
56
0
09 May 2018
Interpretable Adversarial Perturbation in Input Embedding Space for Text
Motoki Sato
Jun Suzuki
Hiroyuki Shindo
Yuji Matsumoto
21
188
0
08 May 2018
Online normalizer calculation for softmax
Maxim Milakov
N. Gimelshein
27
84
0
08 May 2018
Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling
Liyuan Liu
Xiang Ren
Jingbo Shang
Jian-wei Peng
Jiawei Han
25
44
0
20 Apr 2018
Lightweight Adaptive Mixture of Neural and N-gram Language Models
A. Bakhtin
Arthur Szlam
Marc'Aurelio Ranzato
Edouard Grave
20
11
0
20 Apr 2018
Fast Parametric Learning with Activation Memorization
Jack W. Rae
Chris Dyer
Peter Dayan
Timothy Lillicrap
KELM
41
46
0
27 Mar 2018
Unbiased scalable softmax optimization
Francois Fagan
G. Iyengar
6
12
0
22 Mar 2018
An Analysis of Neural Language Modeling at Multiple Scales
Stephen Merity
N. Keskar
R. Socher
24
170
0
22 Mar 2018
Augment and Reduce: Stochastic Inference for Large Categorical Distributions
Francisco J. R. Ruiz
Michalis K. Titsias
Adji Bousso Dieng
David M. Blei
BDL
19
22
0
12 Feb 2018
Accelerated Training for Massive Classification via Dynamic Class Selection
Xingcheng Zhang
Lei Yang
Junjie Yan
Dahua Lin
33
41
0
05 Jan 2018
Topic Compositional Neural Language Model
Wenlin Wang
Zhe Gan
Wenqi Wang
Dinghan Shen
Jiaji Huang
Ming-Yu Liu
S. Satheesh
Lawrence Carin
11
81
0
28 Dec 2017
Cavs: A Vertex-centric Programming Interface for Dynamic Neural Networks
Huan Zhang
Shizhen Xu
Graham Neubig
Wei-Ming Dai
Qirong Ho
Guangwen Yang
Eric Xing
GNN
28
3
0
11 Dec 2017
Adaptive Sampled Softmax with Kernel Based Sampling
Guy Blanc
Steffen Rendle
BDL
14
73
0
02 Dec 2017
Slim Embedding Layers for Recurrent Neural Language Models
Zhongliang Li
Raymond Kulhanek
Shaojun Wang
Yunxin Zhao
Shuang Wu
KELM
27
23
0
27 Nov 2017
Unbounded cache model for online language modeling with open vocabulary
Edouard Grave
Moustapha Cissé
Armand Joulin
KELM
CLL
18
62
0
07 Nov 2017
Self-organized Hierarchical Softmax
Songlin Yang
Shawn Tan
C. Pal
Aaron Courville
BDL
38
7
0
26 Jul 2017
Syllable-aware Neural Language Models: A Failure to Beat Character-aware Ones
Z. Assylbekov
Rustem Takhanov
Bagdat Myrzakhmetov
Jonathan North Washington
38
17
0
20 Jul 2017
Automatic Speech Recognition with Very Large Conversational Finnish and Estonian Vocabularies
Seppo Enarvi
Peter Smit
Sami Virpioja
M. Kurimo
23
37
0
13 Jul 2017
TAPAS: Two-pass Approximate Adaptive Sampling for Softmax
Yu Bai
S. Goldman
Li Zhang
TPM
16
15
0
10 Jul 2017
Getting deep recommenders fit: Bloom embeddings for sparse binary input/output networks
Joan Serrà
Alexandros Karatzoglou
28
52
0
13 Jun 2017
Fast Single-Class Classification and the Principle of Logit Separation
Gil Keren
Sivan Sabato
Björn Schuller
21
6
0
29 May 2017
Autoregressive Convolutional Neural Networks for Asynchronous Time Series
Mikolaj Binkowski
Gautier Marti
Philippe Donnat
AI4TS
BDL
43
149
0
12 Mar 2017
Language Modeling with Gated Convolutional Networks
Yann N. Dauphin
Angela Fan
Michael Auli
David Grangier
80
2,364
0
23 Dec 2016
Improving Neural Language Models with a Continuous Cache
Edouard Grave
Armand Joulin
Nicolas Usunier
KELM
11
300
0
13 Dec 2016
FastText.zip: Compressing text classification models
Armand Joulin
Edouard Grave
Piotr Bojanowski
Matthijs Douze
Hervé Jégou
Tomáš Mikolov
MQ
25
1,190
0
12 Dec 2016
Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation
Yacine Jernite
A. Choromańska
David Sontag
25
35
0
14 Oct 2016