An Analysis of Neural Language Modeling at Multiple Scales

22 March 2018
Stephen Merity
N. Keskar
R. Socher
arXiv:1803.08240

Papers citing "An Analysis of Neural Language Modeling at Multiple Scales"

50 / 101 papers shown
Better Document-Level Machine Translation with Bayes' Rule
Lei Yu
Laurent Sartran
Wojciech Stokowiec
Wang Ling
Lingpeng Kong
Phil Blunsom
Chris Dyer
151
7
0
01 Oct 2019
Learning Dynamic Author Representations with Temporal Language Models
Industrial Conference on Data Mining (IDM), 2019
E. Delasalles
Sylvain Lamprier
Ludovic Denoyer
125
10
0
11 Sep 2019
Deep Equilibrium Models
Neural Information Processing Systems (NeurIPS), 2019
Shaojie Bai
J. Zico Kolter
V. Koltun
219
768
0
03 Sep 2019
Incorporating Word and Subword Units in Unsupervised Machine Translation Using Language Model Rescoring
Conference on Machine Translation (WMT), 2019
Zihan Liu
Yan Xu
Genta Indra Winata
Pascale Fung
336
22
0
16 Aug 2019
Deep Reinforcement Learning For Modeling Chit-Chat Dialog With Discrete Attributes
SIGDIAL Conferences (SIGDIAL), 2019
Chinnadhurai Sankar
Sujith Ravi
OffRL
197
33
0
05 Jul 2019
Augmenting Self-attention with Persistent Memory
Sainbayar Sukhbaatar
Edouard Grave
Guillaume Lample
Hervé Jégou
Armand Joulin
RALM, KELM
221
149
0
02 Jul 2019
A Tensorized Transformer for Language Modeling
Neural Information Processing Systems (NeurIPS), 2019
Xindian Ma
Peng Zhang
Shuai Zhang
Nan Duan
Yuexian Hou
D. Song
M. Zhou
346
186
0
24 Jun 2019
Evaluating Computational Language Models with Scaling Properties of Natural Language
Computational Linguistics (CL), 2019
Shuntaro Takahashi
Kumiko Tanaka-Ishii
156
34
0
22 Jun 2019
Tabula nearly rasa: Probing the Linguistic Knowledge of Character-Level Neural Language Models Trained on Unsegmented Text
Transactions of the Association for Computational Linguistics (TACL), 2019
Michael Hahn
Marco Baroni
LMTD
79
15
0
17 Jun 2019
Meaning to Form: Measuring Systematicity as Information
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Tiago Pimentel
Arya D. McCarthy
Damián E. Blasi
Brian Roark
Ryan Cotterell
129
39
0
13 Jun 2019
Character n-gram Embeddings to Improve RNN Language Models
AAAI Conference on Artificial Intelligence (AAAI), 2019
Sho Takase
Jun Suzuki
Masaaki Nagata
143
25
0
13 Jun 2019
What Kind of Language Is Hard to Language-Model?
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Sabrina J. Mielke
Ryan Cotterell
Kyle Gorman
Brian Roark
Jason Eisner
290
88
0
11 Jun 2019
Calibration, Entropy Rates, and Memory in Language Models
International Conference on Machine Learning (ICML), 2019
M. Braverman
Xinyi Chen
Sham Kakade
Karthik Narasimhan
Cyril Zhang
Yi Zhang
224
43
0
11 Jun 2019
Improving Neural Language Modeling via Adversarial Training
International Conference on Machine Learning (ICML), 2019
Dilin Wang
Chengyue Gong
Qiang Liu
AAML
286
122
0
10 Jun 2019
Improving Neural Language Models by Segmenting, Attending, and Predicting the Future
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Hongyin Luo
Lan Jiang
Yonatan Belinkov
James R. Glass
146
14
0
04 Jun 2019
Improved memory in recurrent neural networks with sequential non-normal dynamics
International Conference on Learning Representations (ICLR), 2019
A. Orhan
Xaq Pitkow
201
16
0
31 May 2019
Rethinking Full Connectivity in Recurrent Neural Networks
Matthijs Van Keirsbilck
A. Keller
Xiaodong Yang
LRM
133
16
0
29 May 2019
Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics
Neural Information Processing Systems (NeurIPS), 2019
Giancarlo Kerg
Kyle Goyette
M. P. Touzel
Gauthier Gidel
Eugene Vorontsov
Yoshua Bengio
Guillaume Lajoie
226
64
0
28 May 2019
Discrete Flows: Invertible Generative Models of Discrete Data
Dustin Tran
Keyon Vafa
Kumar Krishna Agrawal
Laurent Dinh
Ben Poole
DRL
265
123
0
24 May 2019
Efficient Optimization of Loops and Limits with Randomized Telescoping Sums
International Conference on Machine Learning (ICML), 2019
Alex Beatson
Ryan P. Adams
141
21
0
16 May 2019
A Review of Keyphrase Extraction
Eirini Papagiannopoulou
Grigorios Tsoumakas
209
184
0
13 May 2019
Dynamic Evaluation of Transformer Language Models
Ben Krause
Emmanuel Kahembwe
Iain Murray
Steve Renals
212
45
0
17 Apr 2019
fairseq: A Fast, Extensible Toolkit for Sequence Modeling
Myle Ott
Sergey Edunov
Alexei Baevski
Angela Fan
Sam Gross
Nathan Ng
David Grangier
Michael Auli
VLM, FaML
535
3,315
0
01 Apr 2019
Zeno++: Robust Fully Asynchronous SGD
Cong Xie
Oluwasanmi Koyejo
Indranil Gupta
FedML
350
127
0
17 Mar 2019
Efficient Contextual Representation Learning Without Softmax Layer
Liunian Harold Li
Patrick H. Chen
Cho-Jui Hsieh
Kai-Wei Chang
111
6
0
28 Feb 2019
Optimal Kronecker-Sum Approximation of Real Time Recurrent Learning
International Conference on Machine Learning (ICML), 2019
Frederik Benzing
M. Gauy
Asier Mujika
A. Martinsson
Angelika Steger
232
28
0
11 Feb 2019
Character-based Surprisal as a Model of Reading Difficulty in the Presence of Error
Michael Hahn
Frank Keller
Yonatan Bisk
Yonatan Belinkov
107
2
0
02 Feb 2019
Hardware-Guided Symbiotic Training for Compact, Accurate, yet Execution-Efficient LSTM
Hongxu Yin
Guoyang Chen
Yingmin Li
Shuai Che
Weifeng Zhang
N. Jha
105
10
0
30 Jan 2019
Latent Normalizing Flows for Discrete Sequences
International Conference on Machine Learning (ICML), 2019
Zachary M. Ziegler
Alexander M. Rush
BDL, DRL
466
131
0
29 Jan 2019
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
VLM
728
4,111
0
09 Jan 2019
Deep Anomaly Detection with Outlier Exposure
Dan Hendrycks
Mantas Mazeika
Thomas G. Dietterich
OODD
1.7K
1,642
0
11 Dec 2018
Analysing Dropout and Compounding Errors in Neural Language Models
James O'Neill
Danushka Bollegala
96
1
0
02 Nov 2018
Progress and Tradeoffs in Neural Language Models
Raphael Tang
Jimmy J. Lin
96
5
0
02 Nov 2018
Language Modeling at Scale
Md. Mostofa Ali Patwary
Milind Chabbi
Heewoo Jun
Jiaji Huang
G. Diamos
Kenneth Church
ALM
97
5
0
23 Oct 2018
Trellis Networks for Sequence Modeling
Shaojie Bai
J. Zico Kolter
V. Koltun
192
156
0
15 Oct 2018
signSGD with Majority Vote is Communication Efficient And Fault Tolerant
Jeremy Bernstein
Jiawei Zhao
Kamyar Azizzadenesheli
Anima Anandkumar
FedML
245
49
0
11 Oct 2018
Understanding Recurrent Neural Architectures by Analyzing and Synthesizing Long Distance Dependencies in Benchmark Sequential Datasets
Abhijit Mahalunkar
John D. Kelleher
264
8
0
06 Oct 2018
Adaptive Input Representations for Neural Language Modeling
International Conference on Learning Representations (ICLR), 2018
Alexei Baevski
Michael Auli
581
422
0
28 Sep 2018
Adaptive Pruning of Neural Language Models for Mobile Devices
Raphael Tang
Jimmy J. Lin
121
7
0
27 Sep 2018
How clever is the FiLM model, and how clever can it be?
A. Kuhnle
Huiyuan Xie
Ann A. Copestake
151
7
0
09 Sep 2018
Under the Hood: Using Diagnostic Classifiers to Investigate and Improve how Language Models Track Agreement Information
Mario Giulianelli
J. Harding
Florian Mohnert
Dieuwke Hupkes
Willem H. Zuidema
337
196
0
24 Aug 2018
Improved Language Modeling by Decoding the Past
Siddhartha Brahma
BDL, AI4TS
271
6
0
14 Aug 2018
On Training Recurrent Networks with Truncated Backpropagation Through Time in Speech Recognition
Hao Tang
James R. Glass
136
20
0
09 Jul 2018
Insights on representational similarity in neural networks with canonical correlation
Ari S. Morcos
M. Raghu
Samy Bengio
DRL
385
482
0
14 Jun 2018
Navigating with Graph Representations for Fast and Scalable Decoding of Neural Language Models
Minjia Zhang
Xiaodong Liu
Wenhan Wang
Jianfeng Gao
Yuxiong He
114
31
0
11 Jun 2018
Efficient Full-Matrix Adaptive Regularization
Naman Agarwal
Brian Bullins
Xinyi Chen
Elad Hazan
Karan Singh
Cyril Zhang
Yi Zhang
130
22
0
08 Jun 2018
Approximating Real-Time Recurrent Learning with Random Kronecker Factors
Asier Mujika
Florian Meier
Angelika Steger
263
64
0
28 May 2018
State Gradients for RNN Memory Analysis
Lyan Verwimp
Hugo Van hamme
Vincent Renkens
P. Wambacq
170
6
0
11 May 2018
The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks
Nicholas Carlini
Chang-rui Liu
Ulfar Erlingsson
Jernej Kos
Dawn Song
693
1,308
0
22 Feb 2018
Simple Recurrent Units for Highly Parallelizable Recurrence
Tao Lei
Yu Zhang
Sida I. Wang
Huijing Dai
Yoav Artzi
LRM
538
295
0
08 Sep 2017