© 2025 ResearchTrend.AI, All rights reserved.

An Analysis of Neural Language Modeling at Multiple Scales
arXiv:1803.08240 · 22 March 2018
Stephen Merity, N. Keskar, R. Socher

Papers citing "An Analysis of Neural Language Modeling at Multiple Scales"

50 of 102 citing papers shown:
  • "Learning Dynamic Author Representations with Temporal Language Models", E. Delasalles, Sylvain Lamprier, Ludovic Denoyer (11 Sep 2019)
  • "Deep Equilibrium Models", Shaojie Bai, J. Zico Kolter, V. Koltun (03 Sep 2019)
  • "Incorporating Word and Subword Units in Unsupervised Machine Translation Using Language Model Rescoring", Zihan Liu, Yan Xu, Genta Indra Winata, Pascale Fung (16 Aug 2019)
  • "Deep Reinforcement Learning For Modeling Chit-Chat Dialog With Discrete Attributes", Chinnadhurai Sankar, Sujith Ravi (05 Jul 2019)
  • "Augmenting Self-attention with Persistent Memory", Sainbayar Sukhbaatar, Edouard Grave, Guillaume Lample, Hervé Jégou, Armand Joulin (02 Jul 2019)
  • "A Tensorized Transformer for Language Modeling", Xindian Ma, Peng Zhang, Shuai Zhang, Nan Duan, Yuexian Hou, D. Song, M. Zhou (24 Jun 2019)
  • "Evaluating Computational Language Models with Scaling Properties of Natural Language", Shuntaro Takahashi, Kumiko Tanaka-Ishii (22 Jun 2019)
  • "Tabula nearly rasa: Probing the Linguistic Knowledge of Character-Level Neural Language Models Trained on Unsegmented Text", Michael Hahn, Marco Baroni (17 Jun 2019)
  • "Meaning to Form: Measuring Systematicity as Information", Tiago Pimentel, Arya D. McCarthy, Damián E. Blasi, Brian Roark, Ryan Cotterell (13 Jun 2019)
  • "Character n-gram Embeddings to Improve RNN Language Models", Sho Takase, Jun Suzuki, Masaaki Nagata (13 Jun 2019)
  • "What Kind of Language Is Hard to Language-Model?", Sabrina J. Mielke, Ryan Cotterell, Kyle Gorman, Brian Roark, Jason Eisner (11 Jun 2019)
  • "Calibration, Entropy Rates, and Memory in Language Models", M. Braverman, Xinyi Chen, Sham Kakade, Karthik Narasimhan, Cyril Zhang, Yi Zhang (11 Jun 2019)
  • "Improving Neural Language Modeling via Adversarial Training", Dilin Wang, Chengyue Gong, Qiang Liu (10 Jun 2019)
  • "Improving Neural Language Models by Segmenting, Attending, and Predicting the Future", Hongyin Luo, Lan Jiang, Yonatan Belinkov, James R. Glass (04 Jun 2019)
  • "Improved memory in recurrent neural networks with sequential non-normal dynamics", A. Orhan, Xaq Pitkow (31 May 2019)
  • "Rethinking Full Connectivity in Recurrent Neural Networks", Matthijs Van Keirsbilck, A. Keller, Xiaodong Yang (29 May 2019)
  • "Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics", Giancarlo Kerg, Kyle Goyette, M. P. Touzel, Gauthier Gidel, Eugene Vorontsov, Yoshua Bengio, Guillaume Lajoie (28 May 2019)
  • "Discrete Flows: Invertible Generative Models of Discrete Data", Dustin Tran, Keyon Vafa, Kumar Krishna Agrawal, Laurent Dinh, Ben Poole (24 May 2019)
  • "Efficient Optimization of Loops and Limits with Randomized Telescoping Sums", Alex Beatson, Ryan P. Adams (16 May 2019)
  • "A Review of Keyphrase Extraction", Eirini Papagiannopoulou, Grigorios Tsoumakas (13 May 2019)
  • "Dynamic Evaluation of Transformer Language Models", Ben Krause, Emmanuel Kahembwe, Iain Murray, Steve Renals (17 Apr 2019)
  • "fairseq: A Fast, Extensible Toolkit for Sequence Modeling", Myle Ott, Sergey Edunov, Alexei Baevski, Angela Fan, Sam Gross, Nathan Ng, David Grangier, Michael Auli (01 Apr 2019)
  • "Zeno++: Robust Fully Asynchronous SGD", Cong Xie, Oluwasanmi Koyejo, Indranil Gupta (17 Mar 2019)
  • "Efficient Contextual Representation Learning Without Softmax Layer", Liunian Harold Li, Patrick H. Chen, Cho-Jui Hsieh, Kai-Wei Chang (28 Feb 2019)
  • "Optimal Kronecker-Sum Approximation of Real Time Recurrent Learning", Frederik Benzing, M. Gauy, Asier Mujika, A. Martinsson, Angelika Steger (11 Feb 2019)
  • "Character-based Surprisal as a Model of Reading Difficulty in the Presence of Error", Michael Hahn, Frank Keller, Yonatan Bisk, Yonatan Belinkov (02 Feb 2019)
  • "Hardware-Guided Symbiotic Training for Compact, Accurate, yet Execution-Efficient LSTM", Hongxu Yin, Guoyang Chen, Yingmin Li, Shuai Che, Weifeng Zhang, N. Jha (30 Jan 2019)
  • "Latent Normalizing Flows for Discrete Sequences", Zachary M. Ziegler, Alexander M. Rush (29 Jan 2019)
  • "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context", Zihang Dai, Zhilin Yang, Yiming Yang, J. Carbonell, Quoc V. Le, Ruslan Salakhutdinov (09 Jan 2019)
  • "Deep Anomaly Detection with Outlier Exposure", Dan Hendrycks, Mantas Mazeika, Thomas G. Dietterich (11 Dec 2018)
  • "Analysing Dropout and Compounding Errors in Neural Language Models", James O'Neill, Danushka Bollegala (02 Nov 2018)
  • "Progress and Tradeoffs in Neural Language Models", Raphael Tang, Jimmy J. Lin (02 Nov 2018)
  • "Language Modeling at Scale", Md. Mostofa Ali Patwary, Milind Chabbi, Heewoo Jun, Jiaji Huang, G. Diamos, Kenneth Ward Church (23 Oct 2018)
  • "Trellis Networks for Sequence Modeling", Shaojie Bai, J. Zico Kolter, V. Koltun (15 Oct 2018)
  • "signSGD with Majority Vote is Communication Efficient And Fault Tolerant", Jeremy Bernstein, Jiawei Zhao, Kamyar Azizzadenesheli, Anima Anandkumar (11 Oct 2018)
  • "Understanding Recurrent Neural Architectures by Analyzing and Synthesizing Long Distance Dependencies in Benchmark Sequential Datasets", Abhijit Mahalunkar, John D. Kelleher (06 Oct 2018)
  • "Adaptive Input Representations for Neural Language Modeling", Alexei Baevski, Michael Auli (28 Sep 2018)
  • "Adaptive Pruning of Neural Language Models for Mobile Devices", Raphael Tang, Jimmy J. Lin (27 Sep 2018)
  • "How clever is the FiLM model, and how clever can it be?", A. Kuhnle, Huiyuan Xie, Ann A. Copestake (09 Sep 2018)
  • "Under the Hood: Using Diagnostic Classifiers to Investigate and Improve how Language Models Track Agreement Information", Mario Giulianelli, J. Harding, Florian Mohnert, Dieuwke Hupkes, Willem H. Zuidema (24 Aug 2018)
  • "Improved Language Modeling by Decoding the Past", Siddhartha Brahma (14 Aug 2018)
  • "On Training Recurrent Networks with Truncated Backpropagation Through Time in Speech Recognition", Hao Tang, James R. Glass (09 Jul 2018)
  • "Insights on representational similarity in neural networks with canonical correlation", Ari S. Morcos, M. Raghu, Samy Bengio (14 Jun 2018)
  • "Navigating with Graph Representations for Fast and Scalable Decoding of Neural Language Models", Minjia Zhang, Xiaodong Liu, Wenhan Wang, Jianfeng Gao, Yuxiong He (11 Jun 2018)
  • "Efficient Full-Matrix Adaptive Regularization", Naman Agarwal, Brian Bullins, Xinyi Chen, Elad Hazan, Karan Singh, Cyril Zhang, Yi Zhang (08 Jun 2018)
  • "Approximating Real-Time Recurrent Learning with Random Kronecker Factors", Asier Mujika, Florian Meier, Angelika Steger (28 May 2018)
  • "State Gradients for RNN Memory Analysis", Lyan Verwimp, Hugo Van hamme, Vincent Renkens, P. Wambacq (11 May 2018)
  • "The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks", Nicholas Carlini, Chang-rui Liu, Ulfar Erlingsson, Jernej Kos, D. Song (22 Feb 2018)
  • "Simple Recurrent Units for Highly Parallelizable Recurrence", Tao Lei, Yu Zhang, Sida I. Wang, Huijing Dai, Yoav Artzi (08 Sep 2017)
  • "Natural Language Processing: State of The Art, Current Trends and Challenges", Diksha Khurana, Aditya Koli, Kiran Khatter, Sukhdev Singh (17 Aug 2017)