Regularizing and Optimizing LSTM Language Models

7 August 2017
Stephen Merity
N. Keskar
R. Socher

Papers citing "Regularizing and Optimizing LSTM Language Models"

50 / 508 papers shown
Alternating Synthetic and Real Gradients for Neural Language Modeling
Fangxin Shang
Hao Zhang
16
1
0
27 Feb 2019
Evaluating the Search Phase of Neural Architecture Search
Kaicheng Yu
C. Sciuto
Martin Jaggi
C. Musat
Mathieu Salzmann
9
342
0
21 Feb 2019
Breaking the Softmax Bottleneck via Learnable Monotonic Pointwise Non-linearities
O. Ganea
Sylvain Gelly
Gary Bécigneul
Aliaksei Severyn
21
18
0
21 Feb 2019
Random Search and Reproducibility for Neural Architecture Search
Liam Li
Ameet Talwalkar
OOD
24
716
0
20 Feb 2019
A Simple Baseline for Bayesian Uncertainty in Deep Learning
Wesley J. Maddox
T. Garipov
Pavel Izmailov
Dmitry Vetrov
A. Wilson
BDL
UQCV
11
793
0
07 Feb 2019
Compression of Recurrent Neural Networks for Efficient Language Modeling
Artem M. Grachev
D. Ignatov
Andrey V. Savchenko
11
39
0
06 Feb 2019
Augment your batch: better training with larger batches
Elad Hoffer
Tal Ben-Nun
Itay Hubara
Niv Giladi
Torsten Hoefler
Daniel Soudry
ODL
22
72
0
27 Jan 2019
Variational Smoothing in Recurrent Neural Network Language Models
Lingpeng Kong
Gábor Melis
Wang Ling
Lei Yu
Dani Yogatama
13
3
0
27 Jan 2019
State-Regularized Recurrent Neural Networks
Cheng Wang
Mathias Niepert
18
39
0
25 Jan 2019
Towards Non-saturating Recurrent Units for Modelling Long-term Dependencies
A. Chandar
Chinnadhurai Sankar
Eugene Vorontsov
Samira Ebrahimi Kahou
Yoshua Bengio
21
56
0
22 Jan 2019
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
VLM
10
3,671
0
09 Jan 2019
FastGRNN: A Fast, Accurate, Stable and Tiny Kilobyte Sized Gated Recurrent Neural Network
Aditya Kusupati
Manish Singh
Kush S. Bhatia
A. Kumar
Prateek Jain
Manik Varma
16
189
0
08 Jan 2019
Learning a Generator Model from Terminal Bus Data
N. Stulov
D. Sobajic
Yury Maximov
Deepjyoti Deka
Michael Chertkov
14
4
0
03 Jan 2019
A Tutorial on Deep Latent Variable Models of Natural Language
Yoon Kim
Sam Wiseman
Alexander M. Rush
BDL
VLM
17
42
0
17 Dec 2018
Deep Anomaly Detection with Outlier Exposure
Dan Hendrycks
Mantas Mazeika
Thomas G. Dietterich
OODD
16
1,450
0
11 Dec 2018
Inflo: News Categorization and Keyphrase Extraction for Implementation in an Aggregation System
Pranav A
Nick Sukiennik
Pan Hui
27
2
0
10 Dec 2018
ESPNetv2: A Light-weight, Power Efficient, and General Purpose Convolutional Neural Network
Sachin Mehta
Mohammad Rastegari
Linda G. Shapiro
Hannaneh Hajishirzi
VLM
17
392
0
28 Nov 2018
Plan-And-Write: Towards Better Automatic Storytelling
Lili Yao
Nanyun Peng
R. Weischedel
Kevin Knight
Dongyan Zhao
Rui Yan
6
402
0
14 Nov 2018
Modeling Local Dependence in Natural Language with Multi-channel Recurrent Neural Networks
Chang Xu
Weiran Huang
Hongwei Wang
G. Wang
Tie-Yan Liu
6
13
0
13 Nov 2018
Fine-tuning of Language Models with Discriminator
Vadim Popov
Mikhail Kudinov
11
2
0
12 Nov 2018
Learning to Compose Topic-Aware Mixture of Experts for Zero-Shot Video Captioning
Yoonchang Sung
Jiawei Wu
Da Zhang
Yu-Chuan Su
Pratap Tokekar
24
39
0
07 Nov 2018
Analysing Dropout and Compounding Errors in Neural Language Models
James O'Neill
Danushka Bollegala
20
1
0
02 Nov 2018
Progress and Tradeoffs in Neural Language Models
Raphael Tang
Jimmy J. Lin
8
5
0
02 Nov 2018
You May Not Need Attention
Ofir Press
Noah A. Smith
14
27
0
31 Oct 2018
Language Modeling with Sparse Product of Sememe Experts
Yihong Gu
Jun Yan
Hao Zhu
Zhiyuan Liu
Ruobing Xie
Maosong Sun
Fen Lin
Leyu Lin
MoE
13
31
0
29 Oct 2018
Language Modeling for Code-Switching: Evaluation, Integration of Monolingual Data, and Discriminative Training
Hila Gonen
Yoav Goldberg
6
31
0
28 Oct 2018
Reversible Recurrent Neural Networks
M. Mackay
Paul Vicol
Jimmy Ba
Roger C. Grosse
6
52
0
25 Oct 2018
Universal Language Model Fine-Tuning with Subword Tokenization for Polish
Piotr Czapla
Jeremy Howard
Marcin Kardas
8
7
0
24 Oct 2018
Language Modeling at Scale
Md. Mostofa Ali Patwary
Milind Chabbi
Heewoo Jun
Jiaji Huang
G. Diamos
Kenneth Ward Church
ALM
12
5
0
23 Oct 2018
Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks
Yikang Shen
Shawn Tan
Alessandro Sordoni
Aaron Courville
26
322
0
22 Oct 2018
PepCVAE: Semi-Supervised Targeted Design of Antimicrobial Peptide Sequences
Payel Das
Kahini Wadhawan
Oscar Chang
Tom Sercu
Cicero Nogueira dos Santos
Matthew D Riemer
Vijil Chenthamarakshan
Inkit Padhi
Aleksandra Mojsilović
DRL
15
0
0
17 Oct 2018
Evolutionary Stochastic Gradient Descent for Optimization of Deep Neural Networks
Xiaodong Cui
Wei Zhang
Zoltán Tüske
M. Picheny
ODL
11
89
0
16 Oct 2018
Trellis Networks for Sequence Modeling
Shaojie Bai
J. Zico Kolter
V. Koltun
15
145
0
15 Oct 2018
A System for Massively Parallel Hyperparameter Tuning
Liam Li
Kevin G. Jamieson
Afshin Rostamizadeh
Ekaterina Gonina
Moritz Hardt
Benjamin Recht
Ameet Talwalkar
13
370
0
13 Oct 2018
Dropout as a Structured Shrinkage Prior
Eric T. Nalisnick
José Miguel Hernández-Lobato
Padhraic Smyth
BDL
UQCV
4
1
0
09 Oct 2018
Understanding Recurrent Neural Architectures by Analyzing and Synthesizing Long Distance Dependencies in Benchmark Sequential Datasets
Abhijit Mahalunkar
John D. Kelleher
19
8
0
06 Oct 2018
Adaptive Pruning of Neural Language Models for Mobile Devices
Raphael Tang
Jimmy J. Lin
16
6
0
27 Sep 2018
Information-Weighted Neural Cache Language Models for ASR
Lyan Verwimp
J. Pelemans
Hugo Van hamme
P. Wambacq
KELM
RALM
9
2
0
24 Sep 2018
Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension
Yichong Xu
Xiaodong Liu
Yelong Shen
Jingjing Liu
Jianfeng Gao
19
51
0
18 Sep 2018
FRAGE: Frequency-Agnostic Word Representation
Chengyue Gong
Di He
Xu Tan
Tao Qin
Liwei Wang
Tie-Yan Liu
OOD
18
144
0
18 Sep 2018
Towards JointUD: Part-of-speech Tagging and Lemmatization using Recurrent Neural Networks
G. Arakelyan
Karen Hambardzumyan
Hrant Khachatrian
17
9
0
10 Sep 2018
MTNT: A Testbed for Machine Translation of Noisy Text
Paul Michel
Graham Neubig
11
145
0
02 Sep 2018
Direct Output Connection for a High-Rank Language Model
Sho Takase
Jun Suzuki
Masaaki Nagata
18
36
0
30 Aug 2018
Pyramidal Recurrent Unit for Language Modeling
Sachin Mehta
Rik Koncel-Kedziorski
Mohammad Rastegari
Hannaneh Hajishirzi
19
10
0
27 Aug 2018
Dissecting Contextual Word Embeddings: Architecture and Representation
Matthew E. Peters
Mark Neumann
Luke Zettlemoyer
Wen-tau Yih
13
425
0
27 Aug 2018
Predefined Sparseness in Recurrent Sequence Models
T. Demeester
Johannes Deleu
Fréderic Godin
Chris Develder
11
3
0
27 Aug 2018
Financial Aspect-Based Sentiment Analysis using Deep Representations
Steven Yang
Jason Rosenfeld
Jacques Makutonin
13
13
0
23 Aug 2018
Improving Abstraction in Text Summarization
Wojciech Kryściński
Romain Paulus
Caiming Xiong
R. Socher
11
147
0
23 Aug 2018
Neural Architecture Optimization
Renqian Luo
Fei Tian
Tao Qin
Enhong Chen
Tie-Yan Liu
3DV
26
648
0
22 Aug 2018
Don't Use Large Mini-Batches, Use Local SGD
Tao R. Lin
Sebastian U. Stich
Kumar Kshitij Patel
Martin Jaggi
36
429
0
22 Aug 2018