Regularizing and Optimizing LSTM Language Models

7 August 2017
Stephen Merity
N. Keskar
R. Socher
arXiv: 1708.02182

Papers citing "Regularizing and Optimizing LSTM Language Models"

50 / 508 papers shown
Graph Laplacian Wavelet Transformer via Learnable Spectral Decomposition
Andrew Kiruluta
Eric Lundy
Priscilla Burity
24
0
0
09 May 2025
Smoothed Normalization for Efficient Distributed Private Optimization
Egor Shulgin
Sarit Khirirat
Peter Richtárik
FedML
82
0
0
20 Feb 2025
When, Where and Why to Average Weights?
Niccolò Ajroldi
Antonio Orvieto
Jonas Geiping
MoMe
93
0
0
10 Feb 2025
Efficient Language Modeling for Low-Resource Settings with Hybrid RNN-Transformer Architectures
Gabriel Lindenmaier
Sean Papay
Sebastian Padó
51
0
0
02 Feb 2025
Optimizing Speech-Input Length for Speaker-Independent Depression Classification
Tomasz Rutowski
Amir Harati
Yang Lu
Elizabeth Shriberg
23
15
0
03 Jan 2025
Mask Factory: Towards High-quality Synthetic Data Generation for Dichotomous Image Segmentation
Haotian Qian
YD Chen
Shengtao Lou
F. Khan
Xiaogang Jin
Deng-Ping Fan
DiffM
37
6
0
26 Dec 2024
Robust Speech and Natural Language Processing Models for Depression Screening
Y. Lu
A. Harati
T. Rutowski
R. Oliveira
P. Chlebek
E. Shriberg
AI4MH
39
5
0
26 Dec 2024
Classification of residential and non-residential buildings based on satellite data using deep learning
Jai G Singla
18
0
0
11 Nov 2024
Don't Just Pay Attention, PLANT It: Transfer L2R Models to Fine-tune Attention in Extreme Multi-Label Text Classification
Debjyoti Saharoy
J. Aslam
Virgil Pavlu
VLM
34
0
0
30 Oct 2024
From Gradient Clipping to Normalization for Heavy Tailed SGD
Florian Hübler
Ilyas Fatkhullin
Niao He
40
5
0
17 Oct 2024
Financial Sentiment Analysis on News and Reports Using Large Language Models and FinBERT
Yanxin Shen
Pulin Kirin Zhang
AIFin
24
11
0
02 Oct 2024
Modelando procesos cognitivos de la lectura natural con GPT-2
Bruno Bianchi
Alfredo Umfurer
Juan Esteban Kamienkowski
26
0
0
30 Sep 2024
AsthmaBot: Multi-modal, Multi-Lingual Retrieval Augmented Generation For Asthma Patient Support
Adil Bahaj
Mounir Ghogho
38
2
0
24 Sep 2024
Explaining Datasets in Words: Statistical Models with Natural Language Parameters
Ruiqi Zhong
Heng Wang
Dan Klein
Jacob Steinhardt
35
6
0
13 Sep 2024
Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage
Md. Rafi Ur Rashid
Jing Liu
T. Koike-Akino
Shagufta Mehnaz
Ye Wang
MU
SILM
36
3
0
30 Aug 2024
Interactive Topic Models with Optimal Transport
Garima Dhanania
Sheshera Mysore
Chau Minh Pham
Mohit Iyyer
Hamed Zamani
Andrew McCallum
OT
27
1
0
28 Jun 2024
Hidden Holes: topological aspects of language models
Stephen Fitz
P. Romero
Jiyan Jonas Schneider
35
0
0
09 Jun 2024
Thinking Tokens for Language Modeling
David Herel
Tomáš Mikolov
LRM
19
2
0
14 May 2024
Addressing Topic Granularity and Hallucination in Large Language Models for Topic Modelling
Yida Mu
Peizhen Bai
Kalina Bontcheva
Xingyi Song
33
6
0
01 May 2024
Weight Sparsity Complements Activity Sparsity in Neuromorphic Language Models
Rishav Mukherji
Mark Schöne
Khaleelulla Khan Nazeer
Christian Mayr
David Kappel
Anand Subramoney
35
2
0
01 May 2024
Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM
Michelle S. Lam
Janice Teoh
James A. Landay
Jeffrey Heer
Michael S. Bernstein
27
40
0
18 Apr 2024
Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks
Matteo Tucat
Anirbit Mukherjee
Procheta Sen
Mingfei Sun
Omar Rivasplata
MLT
31
1
0
12 Apr 2024
Neural Optimizer Equation, Decay Function, and Learning Rate Schedule Joint Evolution
Brandon Morgan
Dean Frederick Hougen
ODL
23
0
0
10 Apr 2024
Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models
Yuxin Wen
Leo Marchyok
Sanghyun Hong
Jonas Geiping
Tom Goldstein
Nicholas Carlini
SILM
AAML
26
9
0
01 Apr 2024
Convergence Guarantees for RMSProp and Adam in Generalized-smooth Non-convex Optimization with Affine Noise Variance
Qi Zhang
Yi Zhou
Shaofeng Zou
27
3
0
01 Apr 2024
A Stochastic Quasi-Newton Method for Non-convex Optimization with Non-uniform Smoothness
Zhenyu Sun
Ermin Wei
34
0
0
22 Mar 2024
Multi-Objective Evolutionary Neural Architecture Search for Recurrent Neural Networks
Reinhard Booysen
Anna Sergeevna Bosman
38
1
0
17 Mar 2024
Authorship Attribution in Bangla Literature (AABL) via Transfer Learning using ULMFiT
Aisha Khatun
Anisur Rahman
Md. Saiful Islam
Hemayet Ahmed Chowdhury
A. Tasnim
24
2
0
08 Mar 2024
A&B BNN: Add&Bit-Operation-Only Hardware-Friendly Binary Neural Network
Ruichen Ma
G. Qiao
Yián Liu
L. Meng
N. Ning
Yang Liu
Shaogang Hu
AAML
MQ
26
3
0
06 Mar 2024
Arabic Text Sentiment Analysis: Reinforcing Human-Performed Surveys with Wider Topic Analysis
Latifah Almurqren
Ryan Hodgson
A Ioana Cristea
39
3
0
04 Mar 2024
Learning from Teaching Regularization: Generalizable Correlations Should be Easy to Imitate
Can Jin
Tong Che
Hongwu Peng
Yiyuan Li
Dimitris N. Metaxas
Marco Pavone
44
43
0
05 Feb 2024
Automatic channel selection and spatial feature integration for multi-channel speech recognition across various array topologies
Bingshen Mu
Pengcheng Guo
Dake Guo
Pan Zhou
Wei-Neng Chen
Lei Xie
30
2
0
15 Dec 2023
Language Modeling on a SpiNNaker 2 Neuromorphic Chip
Khaleelulla Khan Nazeer
Mark Schöne
Rishav Mukherji
Bernhard Vogginger
Christian Mayr
David Kappel
Anand Subramoney
32
5
0
14 Dec 2023
A Unified Sampling Framework for Solver Searching of Diffusion Probabilistic Models
En-hao Liu
Xuefei Ning
Huazhong Yang
Yu Wang
DiffM
31
11
0
12 Dec 2023
Advancing State of the Art in Language Modeling
David Herel
Tomáš Mikolov
29
1
0
28 Nov 2023
BEND: Benchmarking DNA Language Models on biologically meaningful tasks
Frederikke Isa Marin
Felix Teufel
Marc Horlacher
Dennis Madsen
Dennis Pultz
Ole Winther
Wouter Boomsma
12
33
0
21 Nov 2023
Activity Sparsity Complements Weight Sparsity for Efficient RNN Inference
Rishav Mukherji
Mark Schöne
Khaleelulla Khan Nazeer
Christian Mayr
Anand Subramoney
30
2
0
13 Nov 2023
Parameter-Agnostic Optimization under Relaxed Smoothness
Florian Hübler
Junchi Yang
Xiang Li
Niao He
26
12
0
06 Nov 2023
Longer Fixations, More Computation: Gaze-Guided Recurrent Neural Networks
Xinting Huang
Jiajing Wan
Ioannis Kritikos
Nora Hollenstein
9
3
0
31 Oct 2023
Out-of-distribution Object Detection through Bayesian Uncertainty Estimation
Tianhao Zhang
Shenglin Wang
N. Bouaynaya
R. Calinescu
Lyudmila Mihaylova
OODD
21
2
0
29 Oct 2023
Rethinking SIGN Training: Provable Nonconvex Acceleration without First- and Second-Order Gradient Lipschitz
Tao Sun
Congliang Chen
Peng Qiao
Li Shen
Xinwang Liu
Dongsheng Li
34
3
0
23 Oct 2023
Controlled Randomness Improves the Performance of Transformer Models
Tobias Deußer
Cong Zhao
Wolfgang Krämer
David Leonhard
Christian Bauckhage
R. Sifa
19
1
0
20 Oct 2023
Prototype of a robotic system to assist the learning process of English language with text-generation through DNN
Carlos Morales-Torres
Mario Campos Soberanis
Diego Campos-Sobrino
8
0
0
20 Sep 2023
Machine Learning Technique Based Fake News Detection
Biplob Kumar Sutradhar
Mohammad Zonaid
Nushrat Jahan Ria
S. R. H. Noori
22
2
0
18 Sep 2023
Differentiable Retrieval Augmentation via Generative Language Modeling for E-commerce Query Intent Classification
Chenyu Zhao
Yunjiang Jiang
Yiming Qiu
Han Zhang
Wen-Yun Yang
RALM
26
5
0
18 Aug 2023
Accurate Neural Network Pruning Requires Rethinking Sparse Optimization
Denis Kuznedelev
Eldar Kurtic
Eugenia Iofinova
Elias Frantar
Alexandra Peste
Dan Alistarh
VLM
21
11
0
03 Aug 2023
FedBIAD: Communication-Efficient and Accuracy-Guaranteed Federated Learning with Bayesian Inference-Based Adaptive Dropout
Jingjing Xue
Min Liu
Sheng Sun
Yuwei Wang
Hui Jiang
Xue Jiang
15
7
0
14 Jul 2023
Lookaround Optimizer: $k$ steps around, 1 step average
Jiangtao Zhang
Shunyu Liu
Jie Song
Tongtian Zhu
Zhenxing Xu
Mingli Song
MoMe
29
6
0
13 Jun 2023
Revisiting Conversation Discourse for Dialogue Disentanglement
Bobo Li
Hao Fei
Fei Li
Shengqiong Wu
Lizi Liao
Yin-wei Wei
Tat-Seng Chua
Donghong Ji
35
1
0
06 Jun 2023
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
Hong Liu
Zhiyuan Li
David Leo Wright Hall
Percy Liang
Tengyu Ma
VLM
27
128
0
23 May 2023