Regularizing and Optimizing LSTM Language Models

7 August 2017
Stephen Merity
N. Keskar
R. Socher

Papers citing "Regularizing and Optimizing LSTM Language Models"

50 / 509 papers shown
Attention-based Modeling for Emotion Detection and Classification in Textual Conversations
Waleed Ragheb
J. Azé
S. Bringay
Maximilien Servajean
11
25
0
14 Jun 2019
Character n-gram Embeddings to Improve RNN Language Models
Sho Takase
Jun Suzuki
Masaaki Nagata
22
25
0
13 Jun 2019
Calibration, Entropy Rates, and Memory in Language Models
M. Braverman
Xinyi Chen
Sham Kakade
Karthik Narasimhan
Cyril Zhang
Yi Zhang
11
38
0
11 Jun 2019
Improving Neural Language Modeling via Adversarial Training
Dilin Wang
Chengyue Gong
Qiang Liu
AAML
35
115
0
10 Jun 2019
Recurrent Kernel Networks
Dexiong Chen
Laurent Jacob
Julien Mairal
15
13
0
07 Jun 2019
One-Shot Neural Architecture Search via Compressive Sensing
Minsu Cho
Mohammadreza Soltani
C. Hegde
14
17
0
07 Jun 2019
Automated Speech Generation from UN General Assembly Statements: Mapping Risks in AI Generated Texts
Joseph Aylett-Bullock
M. Luengo-Oroz
14
15
0
05 Jun 2019
Improving Neural Language Models by Segmenting, Attending, and Predicting the Future
Hongyin Luo
Lan Jiang
Yonatan Belinkov
James R. Glass
8
13
0
04 Jun 2019
Modular Universal Reparameterization: Deep Multi-task Learning Across Diverse Domains
Elliot Meyerson
Risto Miikkulainen
OOD
14
30
0
31 May 2019
Improved memory in recurrent neural networks with sequential non-normal dynamics
A. Orhan
Xaq Pitkow
8
13
0
31 May 2019
On Network Design Spaces for Visual Recognition
Ilija Radosavovic
Justin Johnson
Saining Xie
Wan-Yen Lo
Piotr Dollár
17
134
0
30 May 2019
Efficient Neural Architecture Search via Proximal Iterations
Quanming Yao
Ju Xu
Wei-Wei Tu
Zhanxing Zhu
22
104
0
30 May 2019
Regularization Advantages of Multilingual Neural Language Models for Low Resource Domains
Navid Rekabsaz
Nikolaos Pappas
James Henderson
B. K. Khonglah
S. Madikeri
28
1
0
29 May 2019
Rethinking Full Connectivity in Recurrent Neural Networks
Matthijs Van Keirsbilck
A. Keller
Xiaodong Yang
LRM
14
13
0
29 May 2019
Instant Quantization of Neural Networks using Monte Carlo Methods
Gonçalo Mordido
Matthijs Van Keirsbilck
A. Keller
MQ
19
9
0
29 May 2019
Better Long-Range Dependency By Bootstrapping A Mutual Information Regularizer
Yanshuai Cao
Peng-Tao Xu
9
2
0
28 May 2019
Why gradient clipping accelerates training: A theoretical justification for adaptivity
J. Zhang
Tianxing He
S. Sra
Ali Jadbabaie
14
441
0
28 May 2019
Learning distant cause and effect using only local and immediate credit assignment
D. Rawlinson
Abdelrahman Ahmed
Gideon Kowadlo
10
3
0
28 May 2019
Blockwise Adaptivity: Faster Training and Better Generalization in Deep Learning
Shuai Zheng
James T. Kwok
ODL
8
5
0
23 May 2019
Quantifying Long Range Dependence in Language and User Behavior to improve RNNs
Francois Belletti
Minmin Chen
Ed H. Chi
AI4TS
8
21
0
23 May 2019
Adaptive norms for deep learning with regularized Newton methods
Jonas Köhler
Leonard Adolphs
Aurélien Lucchi
ODL
9
11
0
22 May 2019
Predicting TED Talk Ratings from Language and Prosody
Md. Iftekhar Tanveer
Md. Kamrul Hassan
D. Gildea
Ehsan Hoque
AI4TS
22
2
0
21 May 2019
A Causality-Guided Prediction of the TED Talk Ratings from the Speech-Transcripts using Neural Networks
Md. Iftekhar Tanveer
M. Hasan
D. Gildea
Ehsan Hoque
AI4TS
CML
9
5
0
21 May 2019
ERNIE: Enhanced Language Representation with Informative Entities
Zhengyan Zhang
Xu Han
Zhiyuan Liu
Xin Jiang
Maosong Sun
Qun Liu
6
1,383
0
17 May 2019
Efficient Optimization of Loops and Limits with Randomized Telescoping Sums
Alex Beatson
Ryan P. Adams
9
21
0
16 May 2019
Deep Residual Output Layers for Neural Language Generation
Nikolaos Pappas
James Henderson
16
7
0
14 May 2019
Long Short-Term Memory with Gate and State Level Fusion for Light Field-Based Face Recognition
Alireza Sepas-Moghaddam
Ali Etemad
F. Pereira
P. Correia
CVBM
24
1
0
11 May 2019
Mutual Information Scaling and Expressive Power of Sequence Models
Huitao Shen
15
18
0
10 May 2019
When Deep Learning Met Code Search
J. Cambronero
Hongyu Li
Seohyun Kim
Koushik Sen
S. Chandra
CLIP
16
218
0
09 May 2019
Differentiable Architecture Search with Ensemble Gumbel-Softmax
Jianlong Chang
Xinbang Zhang
Yiwen Guo
Gaofeng Meng
Shiming Xiang
Chunhong Pan
3DPC
30
18
0
06 May 2019
English Broadcast News Speech Recognition by Humans and Machines
Samuel Thomas
Masayuki Suzuki
Yinghui Huang
Gakuto Kurata
Zoltán Tüske
...
Brian Kingsbury
M. Picheny
Tom Dibert
Alice Kaiser-Schatzlein
Bern Samko
6
14
0
30 Apr 2019
Think Again Networks and the Delta Loss
Alexandre Salle
Marcelo O. R. Prates
20
2
0
26 Apr 2019
Survey of Dropout Methods for Deep Neural Networks
Alex Labach
Hojjat Salehinejad
S. Valaee
21
149
0
25 Apr 2019
Divide and Conquer: A Deep CASA Approach to Talker-independent Monaural Speaker Separation
Yuzhou Liu
DeLiang Wang
27
157
0
25 Apr 2019
Some Limit Properties of Markov Chains Induced by Stochastic Recursive Algorithms
Abhishek Gupta
Hao Chen
Jianzong Pi
Gaurav Tendolkar
17
0
0
24 Apr 2019
Adversarial Dropout for Recurrent Neural Networks
Sungrae Park
Kyungwoo Song
Mingi Ji
Wonsung Lee
Il-Chul Moon
6
6
0
22 Apr 2019
Language Models with Transformers
Chenguang Wang
Mu Li
Alex Smola
10
120
0
20 Apr 2019
Sparseout: Controlling Sparsity in Deep Networks
Najeeb Khan
Ian Stavness
BDL
18
9
0
17 Apr 2019
Knowledge-Augmented Language Model and its Application to Unsupervised Named-Entity Recognition
Angli Liu
Jingfei Du
Veselin Stoyanov
11
38
0
09 Apr 2019
Knowledge Distillation For Recurrent Neural Network Language Modeling With Trust Regularization
Yangyang Shi
M. Hwang
X. Lei
Haoyu Sheng
26
25
0
08 Apr 2019
WeNet: Weighted Networks for Recurrent Network Architecture Search
Zhiheng Huang
Bing Xiang
6
4
0
08 Apr 2019
Unsupervised Recurrent Neural Network Grammars
Yoon Kim
Alexander M. Rush
Lei Yu
A. Kuncoro
Chris Dyer
Gábor Melis
LRM
RALM
SSL
22
115
0
07 Apr 2019
Identifying and Reducing Gender Bias in Word-Level Language Models
Shikha Bordia
Samuel R. Bowman
FaML
9
323
0
05 Apr 2019
Plan, Write, and Revise: an Interactive System for Open-Domain Story Generation
Seraphina Goldfarb-Tarrant
Haining Feng
Nanyun Peng
11
66
0
04 Apr 2019
Modeling Vocabulary for Big Code Machine Learning
Hlib Babii
Andrea Janes
Romain Robbes
11
22
0
03 Apr 2019
Understanding language-elicited EEG data by predicting it from a fine-tuned language model
Dan Schwartz
Tom Michael Mitchell
11
20
0
02 Apr 2019
Conversation Model Fine-Tuning for Classifying Client Utterances in Counseling Dialogues
Sungjoon Park
Donghyun Kim
Alice H. Oh
4
14
0
31 Mar 2019
Low Resource Text Classification with ULMFit and Backtranslation
Sam Shleifer
VLM
11
57
0
21 Mar 2019
Zeno++: Robust Fully Asynchronous SGD
Cong Xie
Oluwasanmi Koyejo
Indranil Gupta
FedML
11
106
0
17 Mar 2019
Partially Shuffling the Training Data to Improve Language Models
Ofir Press
11
6
0
11 Mar 2019