arXiv:1803.08240
An Analysis of Neural Language Modeling at Multiple Scales
22 March 2018
Stephen Merity
N. Keskar
R. Socher
Papers citing "An Analysis of Neural Language Modeling at Multiple Scales"
50 / 102 papers shown
Title
Accelerating LLM Inference with Flexible N:M Sparsity via A Fully Digital Compute-in-Memory Accelerator
Akshat Ramachandran
Souvik Kundu
Arnab Raha
Shamik Kundu
Deepak K. Mathaikutty
Tushar Krishna
27
1
0
19 Apr 2025
Efficient Language Modeling for Low-Resource Settings with Hybrid RNN-Transformer Architectures
Gabriel Lindenmaier
Sean Papay
Sebastian Padó
53
0
0
02 Feb 2025
Linear Log-Normal Attention with Unbiased Concentration
Yury Nahshan
Dor-Joseph Kampeas
E. Haleva
22
7
0
22 Nov 2023
Minimal Effective Theory for Phonotactic Memory: Capturing Local Correlations due to Errors in Speech
Paul Myles Eugenio
13
1
0
04 Sep 2023
End-to-End Speech Recognition: A Survey
Rohit Prabhavalkar
Takaaki Hori
Tara N. Sainath
Ralf Schlüter
Shinji Watanabe
VLM
21
148
0
03 Mar 2023
Generative Adversarial Training Can Improve Neural Language Models
Sajad Movahedi
A. Shakery
GAN
AI4CE
18
2
0
02 Nov 2022
N-gram Is Back: Residual Learning of Neural Text Generation with n-gram Language Model
Huayang Li
Deng Cai
J. Xu
Taro Watanabe
VLM
29
1
0
26 Oct 2022
Your Transformer May Not be as Powerful as You Expect
Shengjie Luo
Shanda Li
Shuxin Zheng
Tie-Yan Liu
Liwei Wang
Di He
52
50
0
26 May 2022
Optimize_Prime@DravidianLangTech-ACL2022: Abusive Comment Detection in Tamil
Shantanu Patankar
Omkar Gokhale
Onkar Litake
Aditya Mandke
Dipali M. Kadam
13
6
0
19 Apr 2022
Optimize_Prime@DravidianLangTech-ACL2022: Emotion Analysis in Tamil
Omkar Gokhale
Shantanu Patankar
Onkar Litake
Aditya Mandke
Dipali M. Kadam
14
1
0
19 Apr 2022
Training and Generating Neural Networks in Compressed Weight Space
Kazuki Irie
Jürgen Schmidhuber
11
4
0
31 Dec 2021
Predicting the utility of search spaces for black-box optimization: a simple, budget-aware approach
Setareh Ariafar
Justin Gilmer
Zachary Nado
Jasper Snoek
Rodolphe Jenatton
George E. Dahl
38
1
0
15 Dec 2021
Language Modelling via Learning to Rank
A. Frydenlund
Gagandeep Singh
Frank Rudzicz
45
7
0
13 Oct 2021
Working Memory Connections for LSTM
Federico Landi
Lorenzo Baraldi
Marcella Cornia
Rita Cucchiara
KELM
15
156
0
31 Aug 2021
Machine Unlearning of Features and Labels
Alexander Warnecke
Lukas Pirch
Christian Wressnegger
Konrad Rieck
MU
6
171
0
26 Aug 2021
Towards Zero-shot Language Modeling
E. Ponti
Ivan Vulić
Ryan Cotterell
Roi Reichart
Anna Korhonen
22
19
0
06 Aug 2021
Exploring Self-Identified Counseling Expertise in Online Support Forums
Allison Lahnala
Yuntian Zhao
Charles F Welch
Jonathan K. Kummerfeld
Lawrence C. An
Kenneth Resnicow
Rada Mihalcea
Verónica Pérez-Rosas
14
22
0
24 Jun 2021
A Cognitive Regularizer for Language Modeling
Jason W. Wei
Clara Meister
Ryan Cotterell
11
21
0
15 May 2021
Impact of Gender Debiased Word Embeddings in Language Modeling
Christine Basta
Marta R. Costa-jussá
21
4
0
03 May 2021
Revisiting Simple Neural Probabilistic Language Models
Simeng Sun
Mohit Iyyer
24
14
0
08 Apr 2021
Low-Resource Language Modelling of South African Languages
Stuart Mesham
Luc Hayward
Jared Shapiro
Jan Buys
4
14
0
01 Apr 2021
Contextual Text Embeddings for Twi
P. Azunre
Salomey Osei
S. Addo
Lawrence Asamoah Adu-Gyamfi
Stephen E. Moore
...
Standylove Birago Mensah
Lucien Mensah
Mark Amoako Marcel
A. Amponsah
J. B. Hayfron-Acquah
13
6
0
29 Mar 2021
Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers
Machel Reid
Edison Marrese-Taylor
Y. Matsuo
MoE
14
48
0
01 Jan 2021
ERNIE-Doc: A Retrospective Long-Document Modeling Transformer
Siyu Ding
Junyuan Shang
Shuohuan Wang
Yu Sun
Hao Tian
Hua-Hong Wu
Haifeng Wang
60
52
0
31 Dec 2020
Morphology Matters: A Multilingual Language Modeling Analysis
Hyunji Hayley Park
Katherine J. Zhang
Coleman Haley
K. Steimel
Han Liu
Lane Schwartz
39
47
0
11 Dec 2020
On Extending NLP Techniques from the Categorical to the Latent Space: KL Divergence, Zipf's Law, and Similarity Search
Adam Hare
Yu Chen
Yinan Liu
Zhenming Liu
Christopher G. Brinton
40
2
0
02 Dec 2020
Exploring the Value of Personalized Word Embeddings
Charles F Welch
Jonathan K. Kummerfeld
Verónica Pérez-Rosas
Rada Mihalcea
7
15
0
11 Nov 2020
E.T.: Entity-Transformers. Coreference augmented Neural Language Model for richer mention representations via Entity-Transformer blocks
Nikolaos Stylianou
I. Vlahavas
14
3
0
10 Nov 2020
Compositional Demographic Word Embeddings
Charles F Welch
Jonathan K. Kummerfeld
Verónica Pérez-Rosas
Rada Mihalcea
13
31
0
06 Oct 2020
Automated Source Code Generation and Auto-completion Using Deep Learning: Comparing and Discussing Current Language-Model-Related Approaches
Juan Cruz-Benito
Sanjay Vishwakarma
Francisco Martín-Fernández
Ismael Faro
22
30
0
16 Sep 2020
Adversarial Watermarking Transformer: Towards Tracing Text Provenance with Data Hiding
Sahar Abdelnabi
Mario Fritz
WaLM
18
143
0
07 Sep 2020
Sparse Meta Networks for Sequential Adaptation and its Application to Adaptive Language Modelling
Tsendsuren Munkhdalai
CLL
OffRL
9
4
0
03 Sep 2020
DeLighT: Deep and Light-weight Transformer
Sachin Mehta
Marjan Ghazvininejad
Srini Iyer
Luke Zettlemoyer
Hannaneh Hajishirzi
VLM
17
32
0
03 Aug 2020
Exploring the Vulnerability of Deep Neural Networks: A Study of Parameter Corruption
Xu Sun
Zhiyuan Zhang
Xuancheng Ren
Ruixuan Luo
Liangyou Li
17
39
0
10 Jun 2020
Transfer Learning for British Sign Language Modelling
B. Mocialov
Graham Turner
H. Hastie
SLR
16
18
0
03 Jun 2020
Neural Polysynthetic Language Modelling
Lane Schwartz
Francis M. Tyers
Lori S. Levin
Christo Kirov
Patrick Littell
...
Vasilisa Andriyanets
Aldrian Obaja Muis
Naoki Otani
J. Park
Zhisong Zhang
11
24
0
11 May 2020
Phonotactic Complexity and its Trade-offs
Tiago Pimentel
Brian Roark
Ryan Cotterell
12
37
0
07 May 2020
Learning Architectures from an Extended Search Space for Language Modeling
Yinqiao Li
Chi Hu
Yuhao Zhang
Nuo Xu
Yufan Jiang
Tong Xiao
Jingbo Zhu
Tongran Liu
Changliang Li
14
10
0
06 May 2020
A Survey of Deep Learning for Scientific Discovery
M. Raghu
Erica Schmidt
OOD
AI4CE
35
120
0
26 Mar 2020
Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy
M. Saffar
Ashish Vaswani
David Grangier
MoE
243
579
0
12 Mar 2020
The Implicit and Explicit Regularization Effects of Dropout
Colin Wei
Sham Kakade
Tengyu Ma
19
114
0
28 Feb 2020
Addressing Some Limitations of Transformers with Feedback Memory
Angela Fan
Thibaut Lavril
Edouard Grave
Armand Joulin
Sainbayar Sukhbaatar
18
11
0
21 Feb 2020
Time-aware Large Kernel Convolutions
Vasileios Lioutas
Yuhong Guo
AI4TS
8
29
0
08 Feb 2020
Domain-independent Dominance of Adaptive Methods
Pedro H. P. Savarese
David A. McAllester
Sudarshan Babu
Michael Maire
ODL
8
21
0
04 Dec 2019
GitHub Typo Corpus: A Large-Scale Multilingual Dataset of Misspellings and Grammatical Errors
Masato Hagiwara
Masato Mita
22
28
0
28 Nov 2019
DeFINE: DEep Factorized INput Token Embeddings for Neural Sequence Modeling
Sachin Mehta
Rik Koncel-Kedziorski
Mohammad Rastegari
Hannaneh Hajishirzi
AI4TS
21
23
0
27 Nov 2019
Single Headed Attention RNN: Stop Thinking With Your Head
Stephen Merity
14
68
0
26 Nov 2019
A Multi-language Platform for Generating Algebraic Mathematical Word Problems
Vijini Liyanage
Surangika Ranathunga
11
7
0
19 Nov 2019
Structured Pruning of Large Language Models
Ziheng Wang
Jeremy Wohlwend
Tao Lei
24
280
0
10 Oct 2019
Better Document-Level Machine Translation with Bayes' Rule
Lei Yu
Laurent Sartran
Wojciech Stokowiec
Wang Ling
Lingpeng Kong
Phil Blunsom
Chris Dyer
11
7
0
01 Oct 2019