1708.02182
Regularizing and Optimizing LSTM Language Models
7 August 2017
Stephen Merity
N. Keskar
R. Socher
Papers citing
"Regularizing and Optimizing LSTM Language Models"
50 / 508 papers shown
Broccoli: Sprinkling Lightweight Vocabulary Learning into Everyday Information Diets
Roland Aydin
Lars Klein
Arnaud Miribel
Robert West
11
1
0
16 Apr 2021
RIANN -- A Robust Neural Network Outperforms Attitude Estimation Filters
Daniel Weber
C. Gühmann
Thomas Seel
12
35
0
15 Apr 2021
Lessons on Parameter Sharing across Layers in Transformers
Sho Takase
Shun Kiyono
17
84
0
13 Apr 2021
Evaluating Saliency Methods for Neural Language Models
Shuoyang Ding
Philipp Koehn
FAtt
XAI
21
54
0
12 Apr 2021
Revisiting Simple Neural Probabilistic Language Models
Simeng Sun
Mohit Iyyer
24
14
0
08 Apr 2021
Rethinking Perturbations in Encoder-Decoders for Fast Training
Sho Takase
Shun Kiyono
16
45
0
05 Apr 2021
Low-Resource Language Modelling of South African Languages
Stuart Mesham
Luc Hayward
Jared Shapiro
Jan Buys
4
14
0
01 Apr 2021
Data Augmentation in a Hybrid Approach for Aspect-Based Sentiment Analysis
Tomas Liesting
Flavius Frasincar
Maria Mihaela Truşcǎ
11
30
0
29 Mar 2021
Data Augmentation in Natural Language Processing: A Novel Text Generation Approach for Long and Short Text Classifiers
Markus Bayer
M. Kaufhold
Björn Buchhold
Marcel Keller
J. Dallmeyer
Christian A. Reuter
15
113
0
26 Mar 2021
ESCORT: Ethereum Smart COntRacTs Vulnerability Detection using Deep Neural Network and Transfer Learning
O. Lutz
Huili Chen
Hossein Fereidooni
Christoph Sendner
Alexandra Dmitrienko
A. Sadeghi
F. Koushanfar
8
46
0
23 Mar 2021
Token-wise Curriculum Learning for Neural Machine Translation
Chen Liang
Haoming Jiang
Xiaodong Liu
Pengcheng He
Weizhu Chen
Jianfeng Gao
T. Zhao
21
4
0
20 Mar 2021
Improving Authorship Verification using Linguistic Divergence
Yifan Zhang
Dainis Boumber
Marjan Hosseinia
Fan Yang
Arjun Mukherjee
10
1
0
12 Mar 2021
Nondeterminism and Instability in Neural Network Optimization
Cecilia Summers
M. Dinneen
19
38
0
08 Mar 2021
Random Feature Attention
Hao Peng
Nikolaos Pappas
Dani Yogatama
Roy Schwartz
Noah A. Smith
Lingpeng Kong
19
348
0
03 Mar 2021
indicnlp@kgp at DravidianLangTech-EACL2021: Offensive Language Identification in Dravidian Languages
K. Kedia
Abhilash Nandy
13
23
0
14 Feb 2021
High-Performance Large-Scale Image Recognition Without Normalization
Andrew Brock
Soham De
Samuel L. Smith
Karen Simonyan
VLM
223
512
0
11 Feb 2021
Train your classifier first: Cascade Neural Networks Training from upper layers to lower layers
Shucong Zhang
Cong-Thanh Do
R. Doddipatla
Erfan Loweimi
P. Bell
Steve Renals
11
2
0
09 Feb 2021
Eliminating Sharp Minima from SGD with Truncated Heavy-tailed Noise
Xingyu Wang
Sewoong Oh
C. Rhee
11
13
0
08 Feb 2021
A Comprehensive Survey on Hardware-Aware Neural Architecture Search
Hadjer Benmeziane
K. E. Maghraoui
Hamza Ouarnoughi
Smail Niar
Martin Wistuba
Naigang Wang
26
95
0
22 Jan 2021
Detecting Hostile Posts using Relational Graph Convolutional Network
Sarthak
Shikhar Shukla
K. V. Arya
GNN
6
2
0
10 Jan 2021
AutoDropout: Learning Dropout Patterns to Regularize Deep Networks
Hieu H. Pham
Quoc V. Le
70
56
0
05 Jan 2021
Leveraging Audio Gestalt to Predict Media Memorability
Lorin Sweeney
Graham Healy
A. Smeaton
19
6
0
31 Dec 2020
Contextual Temperature for Language Modeling
Pei-Hsin Wang
Sheng-Iou Hsieh
Shih-Chieh Chang
Yu-Ting Chen
Jia-Yu Pan
Wei Wei
Da-Chang Juan
29
25
0
25 Dec 2020
Optimizing Deep Neural Networks through Neuroevolution with Stochastic Gradient Descent
Haichao Zhang
K. Hao
Lei Gao
Bing Wei
Xue-song Tang
14
12
0
21 Dec 2020
Recent advances in deep learning theory
Fengxiang He
Dacheng Tao
AI4CE
13
50
0
20 Dec 2020
Data-Efficient Methods for Dialogue Systems
Igor Shalyminov
12
0
0
05 Dec 2020
End to End ASR System with Automatic Punctuation Insertion
Yushi Guan
3DV
11
5
0
03 Dec 2020
Mutual Information Constraints for Monte-Carlo Objectives
Gábor Melis
András Gyorgy
Phil Blunsom
14
1
0
01 Dec 2020
Regularizing Recurrent Neural Networks via Sequence Mixup
Armin Karamzade
Amir Najafi
S. Motahari
11
0
0
27 Nov 2020
Learning Associative Inference Using Fast Weight Memory
Imanol Schlag
Tsendsuren Munkhdalai
Jürgen Schmidhuber
KELM
17
44
0
16 Nov 2020
DORB: Dynamically Optimizing Multiple Rewards with Bandits
Ramakanth Pasunuru
Han Guo
Mohit Bansal
OffRL
27
6
0
15 Nov 2020
Exploring the Value of Personalized Word Embeddings
Charles F Welch
Jonathan K. Kummerfeld
Verónica Pérez-Rosas
Rada Mihalcea
7
15
0
11 Nov 2020
Scaling Hidden Markov Language Models
Justin T. Chiu
Alexander M. Rush
BDL
14
25
0
09 Nov 2020
Fusion Models for Improved Visual Captioning
M. Kalimuthu
Aditya Mogadala
Marius Mosbach
Dietrich Klakow
VLM
26
0
0
28 Oct 2020
Delta-STN: Efficient Bilevel Optimization for Neural Networks using Structured Response Jacobians
Juhan Bae
Roger C. Grosse
16
24
0
26 Oct 2020
Revisiting Neural Language Modelling with Syllables
Arturo Oncevay
Kervy Rivas Rojas
11
2
0
24 Oct 2020
Large Scale Legal Text Classification Using Transformer Models
Zein Shaheen
G. Wohlgenannt
Erwin Filtz
AILaw
21
67
0
24 Oct 2020
On Convergence and Generalization of Dropout Training
Poorya Mianjy
R. Arora
24
30
0
23 Oct 2020
Exploiting News Article Structure for Automatic Corpus Generation of Entailment Datasets
Jan Christian Blaise Cruz
Jose Kristian Resabal
James Lin
Dan John Velasco
C. Cheng
4
11
0
22 Oct 2020
Cascaded Models With Cyclic Feedback For Direct Speech Translation
Tsz Kin Lam
Shigehiko Schamoni
Stefan Riezler
22
12
0
21 Oct 2020
Adaptive Gradient Method with Resilience and Momentum
Jie Liu
Chen Lin
Chuming Li
Lu Sheng
Ming-hui Sun
Junjie Yan
Wanli Ouyang
ODL
6
0
0
21 Oct 2020
Complaint Identification in Social Media with Transformer Networks
Mali Jin
Nikolaos Aletras
12
16
0
21 Oct 2020
Where's the Question? A Multi-channel Deep Convolutional Neural Network for Question Identification in Textual Data
George Michalopoulos
Helen H. Chen
Alexander Wong
MedIm
12
1
0
15 Oct 2020
Pagsusuri ng RNN-based Transfer Learning Technique sa Low-Resource Language
Dan John Velasco
4
3
0
13 Oct 2020
Towards Theoretically Understanding Why SGD Generalizes Better Than ADAM in Deep Learning
Pan Zhou
Jiashi Feng
Chao Ma
Caiming Xiong
S. Hoi
E. Weinan
23
227
0
12 Oct 2020
Compositional Demographic Word Embeddings
Charles F Welch
Jonathan K. Kummerfeld
Verónica Pérez-Rosas
Rada Mihalcea
13
31
0
06 Oct 2020
On the Branching Bias of Syntax Extracted from Pre-trained Language Models
Huayang Li
Lemao Liu
Guoping Huang
Shuming Shi
18
6
0
06 Oct 2020
Gauravarora@HASOC-Dravidian-CodeMix-FIRE2020: Pre-training ULMFiT on Synthetically Generated Code-Mixed Data for Hate Speech Detection
Gaurav Arora
6
27
0
05 Oct 2020
Improved Analysis of Clipping Algorithms for Non-convex Optimization
Bohang Zhang
Jikai Jin
Cong Fang
Liwei Wang
30
86
0
05 Oct 2020
Improving Low Compute Language Modeling with In-Domain Embedding Initialisation
Charles F Welch
Rada Mihalcea
Jonathan K. Kummerfeld
AI4CE
11
4
0
29 Sep 2020