1708.02182
Regularizing and Optimizing LSTM Language Models
7 August 2017
Stephen Merity
N. Keskar
R. Socher
Papers citing
"Regularizing and Optimizing LSTM Language Models"
50 / 508 papers shown
Alternating Synthetic and Real Gradients for Neural Language Modeling
Fangxin Shang
Hao Zhang
16
1
0
27 Feb 2019
Evaluating the Search Phase of Neural Architecture Search
Kaicheng Yu
C. Sciuto
Martin Jaggi
C. Musat
Mathieu Salzmann
9
342
0
21 Feb 2019
Breaking the Softmax Bottleneck via Learnable Monotonic Pointwise Non-linearities
O. Ganea
Sylvain Gelly
Gary Bécigneul
Aliaksei Severyn
21
18
0
21 Feb 2019
Random Search and Reproducibility for Neural Architecture Search
Liam Li
Ameet Talwalkar
OOD
24
716
0
20 Feb 2019
A Simple Baseline for Bayesian Uncertainty in Deep Learning
Wesley J. Maddox
T. Garipov
Pavel Izmailov
Dmitry Vetrov
A. Wilson
BDL
UQCV
11
793
0
07 Feb 2019
Compression of Recurrent Neural Networks for Efficient Language Modeling
Artem M. Grachev
D. Ignatov
Andrey V. Savchenko
11
39
0
06 Feb 2019
Augment your batch: better training with larger batches
Elad Hoffer
Tal Ben-Nun
Itay Hubara
Niv Giladi
Torsten Hoefler
Daniel Soudry
ODL
22
72
0
27 Jan 2019
Variational Smoothing in Recurrent Neural Network Language Models
Lingpeng Kong
Gábor Melis
Wang Ling
Lei Yu
Dani Yogatama
13
3
0
27 Jan 2019
State-Regularized Recurrent Neural Networks
Cheng Wang
Mathias Niepert
18
39
0
25 Jan 2019
Towards Non-saturating Recurrent Units for Modelling Long-term Dependencies
A. Chandar
Chinnadhurai Sankar
Eugene Vorontsov
Samira Ebrahimi Kahou
Yoshua Bengio
21
56
0
22 Jan 2019
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
VLM
10
3,671
0
09 Jan 2019
FastGRNN: A Fast, Accurate, Stable and Tiny Kilobyte Sized Gated Recurrent Neural Network
Aditya Kusupati
Manish Singh
Kush S. Bhatia
A. Kumar
Prateek Jain
Manik Varma
16
189
0
08 Jan 2019
Learning a Generator Model from Terminal Bus Data
N. Stulov
D. Sobajic
Yury Maximov
Deepjyoti Deka
Michael Chertkov
14
4
0
03 Jan 2019
A Tutorial on Deep Latent Variable Models of Natural Language
Yoon Kim
Sam Wiseman
Alexander M. Rush
BDL
VLM
17
42
0
17 Dec 2018
Deep Anomaly Detection with Outlier Exposure
Dan Hendrycks
Mantas Mazeika
Thomas G. Dietterich
OODD
16
1,450
0
11 Dec 2018
Inflo: News Categorization and Keyphrase Extraction for Implementation in an Aggregation System
Pranav A
Nick Sukiennik
Pan Hui
27
2
0
10 Dec 2018
ESPNetv2: A Light-weight, Power Efficient, and General Purpose Convolutional Neural Network
Sachin Mehta
Mohammad Rastegari
Linda G. Shapiro
Hannaneh Hajishirzi
VLM
17
392
0
28 Nov 2018
Plan-And-Write: Towards Better Automatic Storytelling
Lili Yao
Nanyun Peng
R. Weischedel
Kevin Knight
Dongyan Zhao
Rui Yan
6
402
0
14 Nov 2018
Modeling Local Dependence in Natural Language with Multi-channel Recurrent Neural Networks
Chang Xu
Weiran Huang
Hongwei Wang
G. Wang
Tie-Yan Liu
6
13
0
13 Nov 2018
Fine-tuning of Language Models with Discriminator
Vadim Popov
Mikhail Kudinov
11
2
0
12 Nov 2018
Learning to Compose Topic-Aware Mixture of Experts for Zero-Shot Video Captioning
Yoonchang Sung
Jiawei Wu
Da Zhang
Yu-Chuan Su
Pratap Tokekar
24
39
0
07 Nov 2018
Analysing Dropout and Compounding Errors in Neural Language Models
James O'Neill
Danushka Bollegala
20
1
0
02 Nov 2018
Progress and Tradeoffs in Neural Language Models
Raphael Tang
Jimmy J. Lin
8
5
0
02 Nov 2018
You May Not Need Attention
Ofir Press
Noah A. Smith
14
27
0
31 Oct 2018
Language Modeling with Sparse Product of Sememe Experts
Yihong Gu
Jun Yan
Hao Zhu
Zhiyuan Liu
Ruobing Xie
Maosong Sun
Fen Lin
Leyu Lin
MoE
13
31
0
29 Oct 2018
Language Modeling for Code-Switching: Evaluation, Integration of Monolingual Data, and Discriminative Training
Hila Gonen
Yoav Goldberg
6
31
0
28 Oct 2018
Reversible Recurrent Neural Networks
M. Mackay
Paul Vicol
Jimmy Ba
Roger C. Grosse
6
52
0
25 Oct 2018
Universal Language Model Fine-Tuning with Subword Tokenization for Polish
Piotr Czapla
Jeremy Howard
Marcin Kardas
8
7
0
24 Oct 2018
Language Modeling at Scale
Md. Mostofa Ali Patwary
Milind Chabbi
Heewoo Jun
Jiaji Huang
G. Diamos
Kenneth Ward Church
ALM
12
5
0
23 Oct 2018
Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks
Yikang Shen
Shawn Tan
Alessandro Sordoni
Aaron Courville
26
322
0
22 Oct 2018
PepCVAE: Semi-Supervised Targeted Design of Antimicrobial Peptide Sequences
Payel Das
Kahini Wadhawan
Oscar Chang
Tom Sercu
Cicero Nogueira dos Santos
Matthew D Riemer
Vijil Chenthamarakshan
Inkit Padhi
Aleksandra Mojsilović
DRL
15
0
0
17 Oct 2018
Evolutionary Stochastic Gradient Descent for Optimization of Deep Neural Networks
Xiaodong Cui
Wei Zhang
Zoltán Tüske
M. Picheny
ODL
11
89
0
16 Oct 2018
Trellis Networks for Sequence Modeling
Shaojie Bai
J. Zico Kolter
V. Koltun
15
145
0
15 Oct 2018
A System for Massively Parallel Hyperparameter Tuning
Liam Li
Kevin G. Jamieson
Afshin Rostamizadeh
Ekaterina Gonina
Moritz Hardt
Benjamin Recht
Ameet Talwalkar
13
370
0
13 Oct 2018
Dropout as a Structured Shrinkage Prior
Eric T. Nalisnick
José Miguel Hernández-Lobato
Padhraic Smyth
BDL
UQCV
4
1
0
09 Oct 2018
Understanding Recurrent Neural Architectures by Analyzing and Synthesizing Long Distance Dependencies in Benchmark Sequential Datasets
Abhijit Mahalunkar
John D. Kelleher
19
8
0
06 Oct 2018
Adaptive Pruning of Neural Language Models for Mobile Devices
Raphael Tang
Jimmy J. Lin
16
6
0
27 Sep 2018
Information-Weighted Neural Cache Language Models for ASR
Lyan Verwimp
J. Pelemans
Hugo Van hamme
P. Wambacq
KELM
RALM
9
2
0
24 Sep 2018
Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension
Yichong Xu
Xiaodong Liu
Yelong Shen
Jingjing Liu
Jianfeng Gao
19
51
0
18 Sep 2018
FRAGE: Frequency-Agnostic Word Representation
Chengyue Gong
Di He
Xu Tan
Tao Qin
Liwei Wang
Tie-Yan Liu
OOD
18
144
0
18 Sep 2018
Towards JointUD: Part-of-speech Tagging and Lemmatization using Recurrent Neural Networks
G. Arakelyan
Karen Hambardzumyan
Hrant Khachatrian
17
9
0
10 Sep 2018
MTNT: A Testbed for Machine Translation of Noisy Text
Paul Michel
Graham Neubig
11
145
0
02 Sep 2018
Direct Output Connection for a High-Rank Language Model
Sho Takase
Jun Suzuki
Masaaki Nagata
18
36
0
30 Aug 2018
Pyramidal Recurrent Unit for Language Modeling
Sachin Mehta
Rik Koncel-Kedziorski
Mohammad Rastegari
Hannaneh Hajishirzi
19
10
0
27 Aug 2018
Dissecting Contextual Word Embeddings: Architecture and Representation
Matthew E. Peters
Mark Neumann
Luke Zettlemoyer
Wen-tau Yih
13
425
0
27 Aug 2018
Predefined Sparseness in Recurrent Sequence Models
T. Demeester
Johannes Deleu
Fréderic Godin
Chris Develder
11
3
0
27 Aug 2018
Financial Aspect-Based Sentiment Analysis using Deep Representations
Steven Yang
Jason Rosenfeld
Jacques Makutonin
13
13
0
23 Aug 2018
Improving Abstraction in Text Summarization
Wojciech Kryściński
Romain Paulus
Caiming Xiong
R. Socher
11
147
0
23 Aug 2018
Neural Architecture Optimization
Renqian Luo
Fei Tian
Tao Qin
Enhong Chen
Tie-Yan Liu
3DV
26
648
0
22 Aug 2018
Don't Use Large Mini-Batches, Use Local SGD
Tao R. Lin
Sebastian U. Stich
Kumar Kshitij Patel
Martin Jaggi
36
429
0
22 Aug 2018