ResearchTrend.AI
Regularizing and Optimizing LSTM Language Models

7 August 2017
Stephen Merity
N. Keskar
R. Socher

Papers citing "Regularizing and Optimizing LSTM Language Models"

50 / 508 papers shown
Title
Broccoli: Sprinkling Lightweight Vocabulary Learning into Everyday Information Diets
Roland Aydin
Lars Klein
Arnaud Miribel
Robert West
11
1
0
16 Apr 2021
RIANN -- A Robust Neural Network Outperforms Attitude Estimation Filters
Daniel Weber
C. Gühmann
Thomas Seel
12
35
0
15 Apr 2021
Lessons on Parameter Sharing across Layers in Transformers
Sho Takase
Shun Kiyono
17
84
0
13 Apr 2021
Evaluating Saliency Methods for Neural Language Models
Shuoyang Ding
Philipp Koehn
FAtt
XAI
21
54
0
12 Apr 2021
Revisiting Simple Neural Probabilistic Language Models
Simeng Sun
Mohit Iyyer
24
14
0
08 Apr 2021
Rethinking Perturbations in Encoder-Decoders for Fast Training
Sho Takase
Shun Kiyono
16
45
0
05 Apr 2021
Low-Resource Language Modelling of South African Languages
Stuart Mesham
Luc Hayward
Jared Shapiro
Jan Buys
4
14
0
01 Apr 2021
Data Augmentation in a Hybrid Approach for Aspect-Based Sentiment Analysis
Tomas Liesting
Flavius Frasincar
Maria Mihaela Truşcǎ
11
30
0
29 Mar 2021
Data Augmentation in Natural Language Processing: A Novel Text Generation Approach for Long and Short Text Classifiers
Markus Bayer
M. Kaufhold
Björn Buchhold
Marcel Keller
J. Dallmeyer
Christian A. Reuter
15
113
0
26 Mar 2021
ESCORT: Ethereum Smart COntRacTs Vulnerability Detection using Deep Neural Network and Transfer Learning
O. Lutz
Huili Chen
Hossein Fereidooni
Christoph Sendner
Alexandra Dmitrienko
A. Sadeghi
F. Koushanfar
8
46
0
23 Mar 2021
Token-wise Curriculum Learning for Neural Machine Translation
Chen Liang
Haoming Jiang
Xiaodong Liu
Pengcheng He
Weizhu Chen
Jianfeng Gao
T. Zhao
21
4
0
20 Mar 2021
Improving Authorship Verification using Linguistic Divergence
Yifan Zhang
Dainis Boumber
Marjan Hosseinia
Fan Yang
Arjun Mukherjee
10
1
0
12 Mar 2021
Nondeterminism and Instability in Neural Network Optimization
Cecilia Summers
M. Dinneen
19
38
0
08 Mar 2021
Random Feature Attention
Hao Peng
Nikolaos Pappas
Dani Yogatama
Roy Schwartz
Noah A. Smith
Lingpeng Kong
19
348
0
03 Mar 2021
indicnlp@kgp at DravidianLangTech-EACL2021: Offensive Language Identification in Dravidian Languages
K. Kedia
Abhilash Nandy
13
23
0
14 Feb 2021
High-Performance Large-Scale Image Recognition Without Normalization
Andrew Brock
Soham De
Samuel L. Smith
Karen Simonyan
VLM
223
512
0
11 Feb 2021
Train your classifier first: Cascade Neural Networks Training from upper layers to lower layers
Shucong Zhang
Cong-Thanh Do
R. Doddipatla
Erfan Loweimi
P. Bell
Steve Renals
11
2
0
09 Feb 2021
Eliminating Sharp Minima from SGD with Truncated Heavy-tailed Noise
Xingyu Wang
Sewoong Oh
C. Rhee
11
13
0
08 Feb 2021
A Comprehensive Survey on Hardware-Aware Neural Architecture Search
Hadjer Benmeziane
K. E. Maghraoui
Hamza Ouarnoughi
Smail Niar
Martin Wistuba
Naigang Wang
26
95
0
22 Jan 2021
Detecting Hostile Posts using Relational Graph Convolutional Network
Sarthak
Shikhar Shukla
K. V. Arya
GNN
6
2
0
10 Jan 2021
AutoDropout: Learning Dropout Patterns to Regularize Deep Networks
Hieu H. Pham
Quoc V. Le
70
56
0
05 Jan 2021
Leveraging Audio Gestalt to Predict Media Memorability
Lorin Sweeney
Graham Healy
A. Smeaton
19
6
0
31 Dec 2020
Contextual Temperature for Language Modeling
Pei-Hsin Wang
Sheng-Iou Hsieh
Shih-Chieh Chang
Yu-Ting Chen
Jia-Yu Pan
Wei Wei
Da-Chang Juan
29
25
0
25 Dec 2020
Optimizing Deep Neural Networks through Neuroevolution with Stochastic Gradient Descent
Haichao Zhang
K. Hao
Lei Gao
Bing Wei
Xue-song Tang
14
12
0
21 Dec 2020
Recent advances in deep learning theory
Fengxiang He
Dacheng Tao
AI4CE
13
50
0
20 Dec 2020
Data-Efficient Methods for Dialogue Systems
Igor Shalyminov
12
0
0
05 Dec 2020
End to End ASR System with Automatic Punctuation Insertion
Yushi Guan
3DV
11
5
0
03 Dec 2020
Mutual Information Constraints for Monte-Carlo Objectives
Gábor Melis
András Gyorgy
Phil Blunsom
14
1
0
01 Dec 2020
Regularizing Recurrent Neural Networks via Sequence Mixup
Armin Karamzade
Amir Najafi
S. Motahari
11
0
0
27 Nov 2020
Learning Associative Inference Using Fast Weight Memory
Imanol Schlag
Tsendsuren Munkhdalai
Jürgen Schmidhuber
KELM
17
44
0
16 Nov 2020
DORB: Dynamically Optimizing Multiple Rewards with Bandits
Ramakanth Pasunuru
Han Guo
Mohit Bansal
OffRL
27
6
0
15 Nov 2020
Exploring the Value of Personalized Word Embeddings
Charles F Welch
Jonathan K. Kummerfeld
Verónica Pérez-Rosas
Rada Mihalcea
7
15
0
11 Nov 2020
Scaling Hidden Markov Language Models
Justin T. Chiu
Alexander M. Rush
BDL
14
25
0
09 Nov 2020
Fusion Models for Improved Visual Captioning
M. Kalimuthu
Aditya Mogadala
Marius Mosbach
Dietrich Klakow
VLM
26
0
0
28 Oct 2020
Delta-STN: Efficient Bilevel Optimization for Neural Networks using Structured Response Jacobians
Juhan Bae
Roger C. Grosse
16
24
0
26 Oct 2020
Revisiting Neural Language Modelling with Syllables
Arturo Oncevay
Kervy Rivas Rojas
11
2
0
24 Oct 2020
Large Scale Legal Text Classification Using Transformer Models
Zein Shaheen
G. Wohlgenannt
Erwin Filtz
AILaw
21
67
0
24 Oct 2020
On Convergence and Generalization of Dropout Training
Poorya Mianjy
R. Arora
24
30
0
23 Oct 2020
Exploiting News Article Structure for Automatic Corpus Generation of Entailment Datasets
Jan Christian Blaise Cruz
Jose Kristian Resabal
James Lin
Dan John Velasco
C. Cheng
4
11
0
22 Oct 2020
Cascaded Models With Cyclic Feedback For Direct Speech Translation
Tsz Kin Lam
Shigehiko Schamoni
Stefan Riezler
22
12
0
21 Oct 2020
Adaptive Gradient Method with Resilience and Momentum
Jie Liu
Chen Lin
Chuming Li
Lu Sheng
Ming-hui Sun
Junjie Yan
Wanli Ouyang
ODL
6
0
0
21 Oct 2020
Complaint Identification in Social Media with Transformer Networks
Mali Jin
Nikolaos Aletras
12
16
0
21 Oct 2020
Where's the Question? A Multi-channel Deep Convolutional Neural Network for Question Identification in Textual Data
George Michalopoulos
Helen H. Chen
Alexander Wong
MedIm
12
1
0
15 Oct 2020
Pagsusuri ng RNN-based Transfer Learning Technique sa Low-Resource Language
Dan John Velasco
4
3
0
13 Oct 2020
Towards Theoretically Understanding Why SGD Generalizes Better Than ADAM in Deep Learning
Pan Zhou
Jiashi Feng
Chao Ma
Caiming Xiong
S. Hoi
Weinan E
23
227
0
12 Oct 2020
Compositional Demographic Word Embeddings
Charles F Welch
Jonathan K. Kummerfeld
Verónica Pérez-Rosas
Rada Mihalcea
13
31
0
06 Oct 2020
On the Branching Bias of Syntax Extracted from Pre-trained Language Models
Huayang Li
Lemao Liu
Guoping Huang
Shuming Shi
18
6
0
06 Oct 2020
Gauravarora@HASOC-Dravidian-CodeMix-FIRE2020: Pre-training ULMFiT on Synthetically Generated Code-Mixed Data for Hate Speech Detection
Gaurav Arora
6
27
0
05 Oct 2020
Improved Analysis of Clipping Algorithms for Non-convex Optimization
Bohang Zhang
Jikai Jin
Cong Fang
Liwei Wang
30
86
0
05 Oct 2020
Improving Low Compute Language Modeling with In-Domain Embedding Initialisation
Charles F Welch
Rada Mihalcea
Jonathan K. Kummerfeld
AI4CE
11
4
0
29 Sep 2020