ResearchTrend.AI


Reservoir Transformers

Annual Meeting of the Association for Computational Linguistics (ACL), 2020
30 December 2020
Sheng Shen, Alexei Baevski, Ari S. Morcos, Kurt Keutzer, Michael Auli, Douwe Kiela
arXiv:2012.15045 · abs · PDF · HTML

Papers citing "Reservoir Transformers"

14 papers
Echo Flow Networks
Hongbo Liu, Jia Xu
AI4TS · 28 Sep 2025

Frozen in Time: Parameter-Efficient Time Series Transformers via Reservoir-Induced Feature Expansion and Fixed Random Dynamics
Pradeep Singh, Mehak Sharma, Anupriya Dey, Balasubramanian Raman
AI4TS · 25 Aug 2025

Perturbative Gradient Training: A novel training paradigm for bridging the gap between deep neural networks and physical reservoir computing
Cliff B. Abbott, Mark Elo, Dmytro A. Bozhko
05 Jun 2025

Learning Music Audio Representations With Limited Data
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Christos Plachouras, Emmanouil Benetos, Johan Pauwels
09 May 2025

Federated Koopman-Reservoir Learning for Large-Scale Multivariate Time-Series Anomaly Detection
SIAM International Conference on Data Mining (SDM), 2025
Long Tan Le, Tung Nguyen, Han Shu, Suranga Seneviratne, Choong Seon Hong, Phuong Vo
14 Mar 2025

Changes by Butterflies: Farsighted Forecasting with Group Reservoir Transformer
Md Kowsher, Abdul Rafae Khan, Jia Xu
14 Feb 2024

Partially Randomizing Transformer Weights for Dialogue Response Diversity
Jing Yang Lee, Kong Aik Lee, Woon-Seng Gan
18 Nov 2023

Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning
International Conference on Learning Representations (ICLR), 2023
Ruizhe Shi, Yuyao Liu, Yanjie Ze, Simon S. Du, Huazhe Xu
OffRL · RALM · 31 Oct 2023

Layer Freezing & Data Sieving: Missing Pieces of a Generic Framework for Sparse Training
Neural Information Processing Systems (NeurIPS), 2022
Geng Yuan, Yanyu Li, Sheng Li, Zhenglun Kong, Sergey Tulyakov, Xulong Tang, Yanzhi Wang, Jian Ren
22 Sep 2022

Physical Deep Learning with Biologically Plausible Training Method
Mitsumasa Nakajima, Katsuma Inoue, Kenji Tanaka, Yasuo Kuniyoshi, Toshikazu Hashimoto, Kohei Nakajima
AI4CE · 01 Apr 2022

How does the pre-training objective affect what large language models learn about linguistic properties?
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Ahmed Alajrami, Nikolaos Aletras
20 Mar 2022

Efficient and Private Federated Learning with Partially Trainable Networks
Hakim Sidahmed, Zheng Xu, Ankush Garg, Yuan Cao, Mingqing Chen
FedML · 06 Oct 2021

What's Hidden in a One-layer Randomly Weighted Transformer?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Sheng Shen, Z. Yao, Douwe Kiela, Kurt Keutzer, Michael W. Mahoney
08 Sep 2021

Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Koustuv Sinha, Robin Jia, Dieuwke Hupkes, J. Pineau, Adina Williams, Douwe Kiela
14 Apr 2021