1708.02182
Cited By
Regularizing and Optimizing LSTM Language Models
7 August 2017
Stephen Merity
N. Keskar
R. Socher
Papers citing
"Regularizing and Optimizing LSTM Language Models"
50 / 509 papers shown
A Unifying Framework of Bilinear LSTMs
Mohit Rajpal
Bryan Kian Hsiang Low
11
0
0
23 Oct 2019
Localization of Fake News Detection via Multitask Transfer Learning
Jan Christian Blaise Cruz
Julianne Agatha Tan
C. Cheng
23
33
0
21 Oct 2019
Evolution of transfer learning in natural language processing
Aditya Malte
Pratik Ratadiya
11
54
0
16 Oct 2019
Hierarchical Hidden Markov Jump Processes for Cancer Screening Modeling
Rui Meng
Braden Soper
J. Nygård
Mari Nygård
Herbert Lee
14
2
0
13 Oct 2019
Deep Independently Recurrent Neural Network (IndRNN)
Shuai Li
Wanqing Li
Chris Cook
Yanbo Gao
21
50
0
11 Oct 2019
Searching for A Robust Neural Architecture in Four GPU Hours
Xuanyi Dong
Yi Yang
11
646
0
10 Oct 2019
Kernel-Based Approaches for Sequence Modeling: Connections to Neural Methods
Kevin J Liang
Guoyin Wang
Yitong Li
Ricardo Henao
Lawrence Carin
27
2
0
09 Oct 2019
AntMan: Sparse Low-Rank Compression to Accelerate RNN inference
Samyam Rajbhandari
H. Shrivastava
J. Rho
MQ
17
8
0
02 Oct 2019
Better Document-Level Machine Translation with Bayes' Rule
Lei Yu
Laurent Sartran
Wojciech Stokowiec
Wang Ling
Lingpeng Kong
Phil Blunsom
Chris Dyer
11
7
0
01 Oct 2019
Generalization in Generation: A closer look at Exposure Bias
Florian Schmidt
11
87
0
01 Oct 2019
A Constructive Prediction of the Generalization Error Across Scales
Jonathan S. Rosenfeld
Amir Rosenfeld
Yonatan Belinkov
Nir Shavit
22
205
0
27 Sep 2019
Pre-train, Interact, Fine-tune: A Novel Interaction Representation for Text Classification
Jianming Zheng
Fei Cai
Honghui Chen
Maarten de Rijke
25
21
0
26 Sep 2019
Two Birds, One Stone: A Simple, Unified Model for Text Generation from Structured and Unstructured Data
H. Shahidi
Ming Li
Jimmy J. Lin
LMTD
11
14
0
23 Sep 2019
Goal-Embedded Dual Hierarchical Model for Task-Oriented Dialogue Generation
Yi-An Lai
Arshit Gupta
Yi Zhang
13
1
0
19 Sep 2019
Alleviating Sequence Information Loss with Data Overlapping and Prime Batch Sizes
Noémien Kocher
Christian Scuito
Lorenzo Tarantino
Alexandros Lazaridis
Andreas Fischer
C. Musat
13
0
0
18 Sep 2019
CTRL: A Conditional Transformer Language Model for Controllable Generation
N. Keskar
Bryan McCann
L. Varshney
Caiming Xiong
R. Socher
AI4CE
55
1,233
0
11 Sep 2019
Learning Dynamic Author Representations with Temporal Language Models
E. Delasalles
Sylvain Lamprier
Ludovic Denoyer
19
9
0
11 Sep 2019
Multimodal Embeddings from Language Models
Shao-Yen Tseng
P. Georgiou
Shrikanth Narayanan
32
11
0
10 Sep 2019
Story Realization: Expanding Plot Events into Sentences
Prithviraj Ammanabrolu
Ethan Tien
W. Cheung
Z. Luo
William Ma
Lara J. Martin
Mark O. Riedl
NAI
19
69
0
08 Sep 2019
PaLM: A Hybrid Parser and Language Model
Hao Peng
Roy Schwartz
Noah A. Smith
AIMat
18
15
0
04 Sep 2019
Deep Equilibrium Models
Shaojie Bai
J. Zico Kolter
V. Koltun
14
657
0
03 Sep 2019
MANAS: Multi-Agent Neural Architecture Search
Vasco Lopes
Fabio Maria Carlucci
P. Esperança
Marco Singh
Victor Gabillon
Antoine Yang
Hang Xu
Zewei Chen
Jun Wang
22
23
0
03 Sep 2019
Behavior Gated Language Models
Prashanth Gurunath Shivakumar
Shao-Yen Tseng
P. Georgiou
Shrikanth Narayanan
26
1
0
31 Aug 2019
Analyzing Customer Feedback for Product Fit Prediction
S. Baier
8
4
0
28 Aug 2019
FinBERT: Financial Sentiment Analysis with Pre-trained Language Models
Dogu Araci
AIFin
21
623
0
27 Aug 2019
On the Effectiveness of Low-Rank Matrix Factorization for LSTM Model Compression
Genta Indra Winata
Andrea Madotto
Jamin Shin
Elham J. Barezi
Pascale Fung
19
28
0
27 Aug 2019
Restricted Recurrent Neural Networks
Enmao Diao
Jie Ding
Vahid Tarokh
11
20
0
21 Aug 2019
Latent Relation Language Models
Hiroaki Hayashi
Zecong Hu
Chenyan Xiong
Graham Neubig
KELM
14
42
0
21 Aug 2019
A Neural Virtual Anchor Synthesizer based on Seq2Seq and GAN Models
Ning Wang
Zhaoxiang Liu
Zezhou Chen
Huan Hu
Shiguo Lian
CVBM
16
9
0
20 Aug 2019
PrivFT: Private and Fast Text Classification with Homomorphic Encryption
Ahmad Al Badawi
Louie Hoang
Chan Fook Mun
Kim Laine
Khin Mi Mi Aung
19
79
0
19 Aug 2019
Incorporating Word and Subword Units in Unsupervised Machine Translation Using Language Model Rescoring
Zihan Liu
Yan Xu
Genta Indra Winata
Pascale Fung
18
22
0
16 Aug 2019
Challenging the Boundaries of Speech Recognition: The MALACH Corpus
M. Picheny
Zoltán Tüske
Brian Kingsbury
Kartik Audhkhasi
Xiaodong Cui
G. Saon
AuLLM
11
13
0
09 Aug 2019
Classification of Hand Movements from EEG using a Deep Attention-based LSTM Network
Guangyi Zhang
Vandad Davoodnia
Alireza Sepas-Moghaddam
Yaoxue Zhang
Ali Etemad
11
130
0
06 Aug 2019
Representation Degeneration Problem in Training Natural Language Generation Models
Jun Gao
Di He
Xu Tan
Tao Qin
Liwei Wang
Tie-Yan Liu
10
263
0
28 Jul 2019
Adaptive Noise Injection: A Structure-Expanding Regularization for RNN
Rui Li
Kai Shuang
Mengyu Gu
Sen Su
17
0
0
25 Jul 2019
Decentralized Deep Learning with Arbitrary Communication Compression
Anastasia Koloskova
Tao R. Lin
Sebastian U. Stich
Martin Jaggi
FedML
17
232
0
22 Jul 2019
Efficient Novelty-Driven Neural Architecture Search
Miao Zhang
Huiqi Li
Shirui Pan
Taoping Liu
Steven W. Su
23
1
0
22 Jul 2019
Lookahead Optimizer: k steps forward, 1 step back
Michael Ruogu Zhang
James Lucas
Geoffrey E. Hinton
Jimmy Ba
ODL
31
718
0
19 Jul 2019
Universality and individuality in neural dynamics across large populations of recurrent networks
Niru Maheswaranathan
Alex H. Williams
Matthew D. Golub
Surya Ganguli
David Sussillo
14
140
0
19 Jul 2019
Low-Shot Classification: A Comparison of Classical and Deep Transfer Machine Learning Approaches
Peter Usherwood
S. Smit
VLM
6
11
0
17 Jul 2019
Multi-Element Long Distance Dependencies: Using SPk Languages to Explore the Characteristics of Long-Distance Dependencies
Abhijit Mahalunkar
John D. Kelleher
13
11
0
13 Jul 2019
Applying a Pre-trained Language Model to Spanish Twitter Humor Prediction
Bobak Farzin
Piotr Czapla
Jeremy Howard
8
7
0
06 Jul 2019
Deep Reinforcement Learning For Modeling Chit-Chat Dialog With Discrete Attributes
Chinnadhurai Sankar
Sujith Ravi
OffRL
21
33
0
05 Jul 2019
Kite: Automatic speech recognition for unmanned aerial vehicles
Dan Oneaţă
H. Cucu
8
13
0
02 Jul 2019
Evaluating Computational Language Models with Scaling Properties of Natural Language
Shuntaro Takahashi
Kumiko Tanaka-Ishii
8
23
0
22 Jun 2019
Generating Empathetic Responses by Looking Ahead the User's Sentiment
Jamin Shin
Peng-Tao Xu
Andrea Madotto
Pascale Fung
10
48
0
20 Jun 2019
Barack's Wife Hillary: Using Knowledge-Graphs for Fact-Aware Language Modeling
Robert L. Logan IV
Nelson F. Liu
Matthew E. Peters
Matt Gardner
Sameer Singh
RALM
17
186
0
17 Jun 2019
Exploiting Unsupervised Pre-training and Automated Feature Engineering for Low-resource Hate Speech Detection in Polish
Renard Korzeniowski
Rafal Rolczynski
Przemyslaw Sadownik
Tomasz Korbak
Marcin Możejko
6
4
0
17 Jun 2019
Structured Pruning of Recurrent Neural Networks through Neuron Selection
Liangjiang Wen
Xuanyang Zhang
Haoli Bai
Zenglin Xu
6
38
0
17 Jun 2019
Dispersed Exponential Family Mixture VAEs for Interpretable Text Generation
Wenxian Shi
Hao Zhou
Ning Miao
Lei Li
CoGe
DRL
13
8
0
16 Jun 2019