Capacity and Trainability in Recurrent Neural Networks

29 November 2016
Jasmine Collins
Jascha Narain Sohl-Dickstein
David Sussillo
arXiv:1611.09913

Papers citing "Capacity and Trainability in Recurrent Neural Networks"

50 / 99 papers shown
Comparison of neural network training strategies for the simulation of dynamical systems
Paul Strasser
Andreas Pfeffer
Jakob Weber
Markus Gurtner
Andreas Körner
243
0
0
03 Dec 2025
Hybrid Quantum-Classical Recurrent Neural Networks
Wenduan Xu
222
1
0
29 Oct 2025
How much do language models memorize?
John X. Morris
Chawin Sitawarin
Chuan Guo
Narine Kokhlikyan
G. E. Suh
Alexander M. Rush
Kamalika Chaudhuri
Saeed Mahloujifar
KELM
ELM
472
39
0
30 May 2025
Task-Optimized Convolutional Recurrent Networks Align with Tactile Processing in the Rodent Brain
Trinity Chung
Yuchen Shen
Nathan C. L. Kong
Aran Nayebi
589
1
0
23 May 2025
The impact of allocation strategies in subset learning on the expressive power of neural networks
International Conference on Learning Representations (ICLR), 2025
Ofir Schlisselberg
Ran Darshan
376
0
0
10 Feb 2025
Measuring and Controlling Solution Degeneracy across Task-Trained Recurrent Neural Networks
Ann Huang
Satpreet H. Singh
Flavio Martinelli
Kanaka Rajan
475
11
0
04 Oct 2024
UNComp: Can Matrix Entropy Uncover Sparsity? -- A Compressor Design from an Uncertainty-Aware Perspective
Jing Xiong
Jianghan Shen
Fanghua Ye
Chaofan Tao
Zhongwei Wan
...
Chuanyang Zheng
Zhijiang Guo
Min Yang
Lingpeng Kong
Ngai Wong
327
9
0
04 Oct 2024
Exploring RWKV for Memory Efficient and Low Latency Streaming ASR
Keyu An
Shiliang Zhang
400
6
0
26 Sep 2023
The minimal computational substrate of fluid intelligence
Cortex, 2023
Amy Nelson
J. Mole
Guilherme Pombo
Robert J. Gray
James K. Ruffle
E. Chan
Geraint Rees
L. Cipolotti
P. Nachev
216
0
0
14 Aug 2023
Trainability, Expressivity and Interpretability in Gated Neural ODEs
International Conference on Machine Learning (ICML), 2023
T. Kim
T. Can
K. Krishnamurthy
AI4CE
464
6
0
12 Jul 2023
Adaptive-saturated RNN: Remember more with less instability
Khoi Minh Nguyen-Duy
Quang Pham
B. T. Nguyen
ODL
141
2
0
24 Apr 2023
Online Evolutionary Neural Architecture Search for Multivariate Non-Stationary Time Series Forecasting
Applied Soft Computing (Appl. Soft Comput.), 2023
Zimeng Lyu
Alexander Ororbia
Travis J. Desell
AI4TS
267
18
0
20 Feb 2023
General-Purpose In-Context Learning by Meta-Learning Transformers
Louis Kirsch
James Harrison
Jascha Narain Sohl-Dickstein
Luke Metz
540
110
0
08 Dec 2022
Criteria for Classifying Forecasting Methods
International Journal of Forecasting (IJF), 2020
Tim Januschowski
Jan Gasthaus
Bernie Wang
David Salinas
Valentin Flunkert
Michael Bohlke-Schneider
Laurent Callot
AI4TS
316
209
0
07 Dec 2022
How Does a Deep Learning Model Architecture Impact Its Privacy? A Comprehensive Study of Privacy Attacks on CNNs and Transformers
USENIX Security Symposium (USENIX Security), 2022
Guangsheng Zhang
B. Liu
Huan Tian
Tianqing Zhu
Ming Ding
Wanlei Zhou
PILM
MIACV
345
14
0
20 Oct 2022
Fast Saturating Gate for Learning Long Time Scales with Recurrent Neural Networks
AAAI Conference on Artificial Intelligence (AAAI), 2022
Kentaro Ohno
Sekitoshi Kanai
Yasutoshi Ida
383
1
0
04 Oct 2022
Memory-Augmented Graph Neural Networks: A Brain-Inspired Review
IEEE Transactions on Artificial Intelligence (IEEE TAI), 2022
Guixiang Ma
Vy A. Vo
Ted Willke
Nesreen Ahmed
212
6
0
22 Sep 2022
TeKo: Text-Rich Graph Neural Networks with External Knowledge
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022
Zhizhi Yu
Di Jin
Jianguo Wei
Ziyang Liu
Yue Shang
Yun Xiao
Jiawei Han
Lingfei Wu
219
6
0
15 Jun 2022
Training neural networks using Metropolis Monte Carlo and an adaptive variant
S. Whitelam
V. Selin
Ian Benlolo
Corneel Casert
Isaac Tamblyn
BDL
319
12
0
16 May 2022
DeepGraviLens: a Multi-Modal Architecture for Classifying Gravitational Lensing Data
Nicolò Oreste Pinciroli Vago
Piero Fraternali
404
5
0
02 May 2022
ONE-NAS: An Online NeuroEvolution based Neural Architecture Search for Time Series Forecasting
Zimeng Lyu
Travis J. Desell
AI4TS
172
7
0
27 Feb 2022
Intelligent Acoustic Module for Autonomous Vehicles using Fast Gated Recurrent approach
Raghav Rawat
Shreyash Gupta
Shreyas Mohapatra
S. P. Mishra
Sreesankar Rajagopal
223
2
0
06 Dec 2021
Adaptive First- and Second-Order Algorithms for Large-Scale Machine Learning
Sanae Lotfi
Tiphaine Bonniot de Ruisselet
D. Orban
Andrea Lodi
ODL
211
1
0
29 Nov 2021
Gradients are Not All You Need
Luke Metz
C. Freeman
S. Schoenholz
Tal Kachman
333
108
0
10 Nov 2021
Understanding How Encoder-Decoder Architectures Attend
Neural Information Processing Systems (NeurIPS), 2021
Kyle Aitken
V. Ramasesh
Yuan Cao
Niru Maheswaranathan
234
28
0
28 Oct 2021
Multi-layer Perceptron Trainability Explained via Variability
Yueyao Yu
Yin Zhang
263
5
0
19 May 2021
Is it enough to optimize CNN architectures on ImageNet?
Lukas Tuggener
Jürgen Schmidhuber
Thilo Stadelmann
240
30
0
16 Mar 2021
Gated Ensemble of Spatio-temporal Mixture of Experts for Multi-task Learning in Ride-hailing System
Multimodal Transportation (MT), 2020
M. Rahman
S. Rifaat
S. N. Sadeek
M. Abrar
D. Wang
644
6
0
31 Dec 2020
Sequence Generation using Deep Recurrent Networks and Embeddings: A study case in music
Sebastian Garcia-Valencia
Alejandro Betancourt
Juan Guillermo Lalinde Pulido
MGen
144
8
0
02 Dec 2020
Continuous Ant-Based Neural Topology Search
A. ElSaid
Joshua Karns
Zimeng Lyu
Alexander Ororbia
Travis J. Desell
227
5
0
21 Nov 2020
Low-Dimensional Manifolds Support Multiplexed Integrations in Recurrent Neural Networks
Neural Computation (Neural Comput.), 2020
Arnaud Fanthomme
R. Monasson
204
6
0
20 Nov 2020
Underspecification Presents Challenges for Credibility in Modern Machine Learning
Alexander D'Amour
Katherine A. Heller
D. Moldovan
Ben Adlam
B. Alipanahi
...
Kellie Webster
Steve Yadlowsky
T. Yun
Xiaohua Zhai
D. Sculley
OffRL
597
830
0
06 Nov 2020
The geometry of integration in text classification RNNs
International Conference on Learning Representations (ICLR), 2020
Kyle Aitken
V. Ramasesh
Ankush Garg
Yuan Cao
David Sussillo
Niru Maheswaranathan
AI4CE
249
16
0
28 Oct 2020
Learnability and Complexity of Quantum Samples
M. Niu
Andrew M. Dai
Li Li
Augustus Odena
Zhengli Zhao
Vadim N. Smelyanskyi
Hartmut Neven
Sergio Boixo
114
14
0
22 Oct 2020
Unfolding recurrence by Green's functions for optimized reservoir computing
Sandra Nestler
Christian Keup
David Dahmen
M. Gilson
Holger Rauhut
M. Helias
227
4
0
13 Oct 2020
RNN Training along Locally Optimal Trajectories via Frank-Wolfe Algorithm
Yun Yue
Ming Li
Venkatesh Saligrama
Ziming Zhang
372
5
0
12 Oct 2020
GECKO: Reconciling Privacy, Accuracy and Efficiency in Embedded Deep Learning
ACM Symposium on Applied Computing (SAC), 2020
Vasisht Duddu
A. Boutet
Virat Shejwalkar
GNN
256
4
0
02 Oct 2020
An Experimental Study of Weight Initialization and Weight Inheritance Effects on Neuroevolution
Zimeng Lyu
A. ElSaid
Joshua Karns
Mohamed Wiem Mkaouer
Travis J. Desell
ODL
286
3
0
21 Sep 2020
Demystifying Deep Learning in Predictive Spatio-Temporal Analytics: An Information-Theoretic Framework
IEEE Transactions on Neural Networks and Learning Systems (IEEE TNNLS), 2020
Qi Tan
Yang Liu
Jiming Liu
AI4TS
314
9
0
14 Sep 2020
Shuffling Recurrent Neural Networks
AAAI Conference on Artificial Intelligence (AAAI), 2020
Michael Rotman
Lior Wolf
BDL
250
36
0
14 Jul 2020
Tree-Augmented Cross-Modal Encoding for Complex-Query Video Retrieval
Xun Yang
Jianfeng Dong
Yixin Cao
Xun Wang
Meng Wang
Tat-Seng Chua
304
151
0
06 Jul 2020
Thalamocortical motor circuit insights for more robust hierarchical control of complex sequences
Laureline Logiaco
G. S. Escola
110
5
0
23 Jun 2020
Neuroevolutionary Transfer Learning of Deep Recurrent Neural Networks through Network-Aware Adaptation
A. ElSaid
Joshua Karns
Alexander Ororbia
Daniel E. Krutz
Zimeng Lyu
Travis J. Desell
189
0
0
04 Jun 2020
A Tree Architecture of LSTM Networks for Sequential Regression with Missing Data
S. O. Sahin
Suleyman S. Kozat
129
2
0
22 May 2020
Improving Neuroevolution Using Island Extinction and Repopulation
Zimeng Lyu
Joshua Karns
A. ElSaid
Travis J. Desell
160
5
0
15 May 2020
Quantitative Analysis of Image Classification Techniques for Memory-Constrained Devices
S. Müksch
Theo X. Olausson
John Wilhelm
Pavlos Andreadis
263
2
0
11 May 2020
How recurrent networks implement contextual processing in sentiment analysis
International Conference on Machine Learning (ICML), 2020
Niru Maheswaranathan
David Sussillo
196
25
0
17 Apr 2020
TraDE: Transformers for Density Estimation
Rasool Fakoor
Pratik Chaudhari
Jonas W. Mueller
Alex Smola
326
32
0
06 Apr 2020
Actor-Transformers for Group Activity Recognition
Computer Vision and Pattern Recognition (CVPR), 2020
Kirill Gavrilyuk
Ryan Sanford
Mehrsan Javan
Cees G. M. Snoek
ViT
245
219
0
28 Mar 2020
CHAMELEON: A Deep Learning Meta-Architecture for News Recommender Systems [Phd. Thesis]
Gabriel de Souza Pereira Moreira
GNN
255
2
0
29 Dec 2019