1708.02182
Regularizing and Optimizing LSTM Language Models
7 August 2017
Stephen Merity
N. Keskar
R. Socher
Papers citing
"Regularizing and Optimizing LSTM Language Models"
50 / 508 papers shown
Title
Improved Language Modeling by Decoding the Past
Siddhartha Brahma
BDL
AI4TS
4
6
0
14 Aug 2018
REGMAPR - Text Matching Made Easy
Siddhartha Brahma
VLM
14
1
0
13 Aug 2018
Confidence penalty, annealing Gaussian noise and zoneout for biLSTM-CRF networks for named entity recognition
Antonio Jimeno Yepes
16
2
0
13 Aug 2018
Character-Level Language Modeling with Deeper Self-Attention
Rami Al-Rfou
Dokook Choe
Noah Constant
Mandy Guo
Llion Jones
20
386
0
09 Aug 2018
On Training Recurrent Networks with Truncated Backpropagation Through Time in Speech Recognition
Hao Tang
James R. Glass
8
19
0
09 Jul 2018
DARTS: Differentiable Architecture Search
Hanxiao Liu
Karen Simonyan
Yiming Yang
6
4,297
0
24 Jun 2018
Insights on representational similarity in neural networks with canonical correlation
Ari S. Morcos
M. Raghu
Samy Bengio
DRL
18
429
0
14 Jun 2018
Navigating with Graph Representations for Fast and Scalable Decoding of Neural Language Models
Minjia Zhang
Xiaodong Liu
Wenhan Wang
Jianfeng Gao
Yuxiong He
23
30
0
11 Jun 2018
Straight to the Tree: Constituency Parsing with Neural Syntactic Distance
Yikang Shen
Zhouhan Lin
Athul Paul Jacob
Alessandro Sordoni
Aaron Courville
Yoshua Bengio
17
91
0
11 Jun 2018
Towards Binary-Valued Gates for Robust LSTM Training
Zhuohan Li
Di He
Fei Tian
Wei-neng Chen
Tao Qin
Liwei Wang
Tie-Yan Liu
MQ
10
47
0
08 Jun 2018
Efficient Full-Matrix Adaptive Regularization
Naman Agarwal
Brian Bullins
Xinyi Chen
Elad Hazan
Karan Singh
Cyril Zhang
Yi Zhang
8
21
0
08 Jun 2018
GamePad: A Learning Environment for Theorem Proving
Daniel Huang
Prafulla Dhariwal
D. Song
Ilya Sutskever
18
109
0
02 Jun 2018
Incremental Natural Language Processing: Challenges, Strategies, and Evaluation
Arne Köhn
CLL
14
11
0
31 May 2018
Sigsoftmax: Reanalysis of the Softmax Bottleneck
Sekitoshi Kanai
Yasuhiro Fujiwara
Yuki Yamanaka
S. Adachi
9
68
0
28 May 2018
Stable Recurrent Models
John Miller
Moritz Hardt
11
116
0
25 May 2018
A Double-Deep Spatio-Angular Learning Framework for Light Field based Face Recognition
Alireza Sepas-Moghaddam
M. A. Haque
P. Correia
Kamal Nasrollahi
T. Moeslund
F. Pereira
CVBM
6
35
0
25 May 2018
Pushing the bounds of dropout
Gábor Melis
Charles Blundell
Tomás Kociský
Karl Moritz Hermann
Chris Dyer
Phil Blunsom
8
13
0
23 May 2018
Breaking the Activation Function Bottleneck through Adaptive Parameterization
Sebastian Flennerhag
Hujun Yin
J. Keane
Mark Elliot
14
12
0
22 May 2018
Improved Sentence Modeling using Suffix Bidirectional LSTM
Siddhartha Brahma
16
24
0
18 May 2018
Learning to Write with Cooperative Discriminators
Ari Holtzman
Jan Buys
Maxwell Forbes
Antoine Bosselut
David Golub
Yejin Choi
12
233
0
16 May 2018
Continuous Learning in a Hierarchical Multiscale Neural Network
Thomas Wolf
Julien Chaumond
Clement Delangue
CLL
AI4CE
NoLa
BDL
11
6
0
15 May 2018
Building Language Models for Text with Named Entities
Md. Rizwan Parvez
Saikat Chakraborty
Baishakhi Ray
Kai-Wei Chang
10
41
0
13 May 2018
Born Again Neural Networks
Tommaso Furlanello
Zachary Chase Lipton
Michael Tschannen
Laurent Itti
Anima Anandkumar
30
1,020
0
12 May 2018
Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context
Urvashi Khandelwal
He He
Peng Qi
Dan Jurafsky
RALM
9
293
0
12 May 2018
State Gradients for RNN Memory Analysis
Lyan Verwimp
Hugo Van hamme
Vincent Renkens
P. Wambacq
6
6
0
11 May 2018
Noisin: Unbiased Regularization for Recurrent Neural Networks
Adji Bousso Dieng
Rajesh Ranganath
Jaan Altosaar
David M. Blei
17
22
0
03 May 2018
Assessing Language Models with Scaling Properties
Shuntaro Takahashi
Kumiko Tanaka-Ishii
ELM
LRM
14
2
0
24 Apr 2018
Dropping Networks for Transfer Learning
J. Ó. Neill
Danushka Bollegala
9
1
0
23 Apr 2018
Spell Once, Summon Anywhere: A Two-Level Open-Vocabulary Language Model
Sabrina J. Mielke
Jason Eisner
LRM
BDL
8
33
0
23 Apr 2018
Training DNNs with Hybrid Block Floating Point
M. Drumond
Tao R. Lin
Martin Jaggi
Babak Falsafi
17
94
0
04 Apr 2018
Aggregated Momentum: Stability Through Passive Damping
James Lucas
Shengyang Sun
R. Zemel
Roger C. Grosse
16
67
0
01 Apr 2018
Meta-Learning a Dynamical Language Model
Thomas Wolf
Julien Chaumond
Clement Delangue
16
4
0
28 Mar 2018
An Analysis of Neural Language Modeling at Multiple Scales
Stephen Merity
N. Keskar
R. Socher
19
170
0
22 Mar 2018
Flipout: Efficient Pseudo-Independent Weight Perturbations on Mini-Batches
Yeming Wen
Paul Vicol
Jimmy Ba
Dustin Tran
Roger C. Grosse
BDL
9
307
0
12 Mar 2018
An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling
Shaojie Bai
J. Zico Kolter
V. Koltun
DRL
13
4,708
0
04 Mar 2018
Learning Sparse Structured Ensembles with SG-MCMC and Network Pruning
Yichi Zhang
Zhijian Ou
17
0
0
01 Mar 2018
Memory-based Parameter Adaptation
Pablo Sprechmann
Siddhant M. Jayakumar
Jack W. Rae
Alexander Pritzel
Adria Puigdomenech Badia
Benigno Uria
Oriol Vinyals
Demis Hassabis
Razvan Pascanu
Charles Blundell
ODL
OOD
VLM
6
101
0
28 Feb 2018
Reusing Weights in Subword-aware Neural Language Models
Z. Assylbekov
Rustem Takhanov
18
4
0
23 Feb 2018
The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks
Nicholas Carlini
Chang-rui Liu
Ulfar Erlingsson
Jernej Kos
D. Song
45
1,111
0
22 Feb 2018
Deep contextualized word representations
Matthew E. Peters
Mark Neumann
Mohit Iyyer
Matt Gardner
Christopher Clark
Kenton Lee
Luke Zettlemoyer
NAI
14
11,483
0
15 Feb 2018
Neural Voice Cloning with a Few Samples
Sercan Ö. Arik
Jitong Chen
Kainan Peng
Wei Ping
Yanqi Zhou
11
380
0
14 Feb 2018
Efficient Neural Architecture Search via Parameter Sharing
Hieu H. Pham
M. Guan
Barret Zoph
Quoc V. Le
J. Dean
19
2,745
0
09 Feb 2018
Universal Language Model Fine-tuning for Text Classification
Jeremy Howard
Sebastian Ruder
VLM
19
274
0
18 Jan 2018
Fix your classifier: the marginal value of training the last weight layer
Elad Hoffer
Itay Hubara
Daniel Soudry
27
101
0
14 Jan 2018
Character-level Recurrent Neural Networks in Practice: Comparing Training and Sampling Schemes
Cedric De Boom
Thomas Demeester
Bart Dhoedt
8
8
0
02 Jan 2018
Improving Generalization Performance by Switching from Adam to SGD
N. Keskar
R. Socher
ODL
19
520
0
20 Dec 2017
A Flexible Approach to Automated RNN Architecture Generation
Martin Schrimpf
Stephen Merity
James Bradbury
R. Socher
19
15
0
20 Dec 2017
Characterizing the hyper-parameter space of LSTM language models for mixed context applications
Victor Akinwande
S. Remy
19
1
0
08 Dec 2017
Breaking the Softmax Bottleneck: A High-Rank RNN Language Model
Zhilin Yang
Zihang Dai
Ruslan Salakhutdinov
William W. Cohen
BDL
16
364
0
10 Nov 2017
Weighted Transformer Network for Machine Translation
Karim Ahmed
N. Keskar
R. Socher
25
133
0
06 Nov 2017