ResearchTrend.AI


Reservoir Transformers

Annual Meeting of the Association for Computational Linguistics (ACL), 2020
30 December 2020
Sheng Shen, Alexei Baevski, Ari S. Morcos, Kurt Keutzer, Michael Auli, Douwe Kiela
arXiv:2012.15045 · abs · PDF · HTML

Papers citing "Reservoir Transformers"

14 papers
Echo Flow Networks
Hongbo Liu, Jia Xu
AI4TS · 28 Sep 2025

Frozen in Time: Parameter-Efficient Time Series Transformers via Reservoir-Induced Feature Expansion and Fixed Random Dynamics
Pradeep Singh, Mehak Sharma, Anupriya Dey, Balasubramanian Raman
AI4TS · 25 Aug 2025

Perturbative Gradient Training: A novel training paradigm for bridging the gap between deep neural networks and physical reservoir computing
Cliff B. Abbott, Mark Elo, Dmytro A. Bozhko
05 Jun 2025

Learning Music Audio Representations With Limited Data
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Christos Plachouras, Emmanouil Benetos, Johan Pauwels
09 May 2025

Federated Koopman-Reservoir Learning for Large-Scale Multivariate Time-Series Anomaly Detection
SIAM International Conference on Data Mining (SDM), 2025
Long Tan Le, Tung Nguyen, Han Shu, Suranga Seneviratne, Choong Seon Hong, Phuong Vo
14 Mar 2025

Changes by Butterflies: Farsighted Forecasting with Group Reservoir Transformer
Md Kowsher, Abdul Rafae Khan, Jia Xu
14 Feb 2024

Partially Randomizing Transformer Weights for Dialogue Response Diversity
Jing Yang Lee, Kong Aik Lee, Woon-Seng Gan
18 Nov 2023

Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning
International Conference on Learning Representations (ICLR), 2023
Ruizhe Shi, Yuyao Liu, Yanjie Ze, Simon S. Du, Huazhe Xu
OffRL · RALM · 31 Oct 2023

Layer Freezing & Data Sieving: Missing Pieces of a Generic Framework for Sparse Training
Neural Information Processing Systems (NeurIPS), 2022
Geng Yuan, Yanyu Li, Sheng Li, Zhenglun Kong, Sergey Tulyakov, Xulong Tang, Yanzhi Wang, Jian Ren
22 Sep 2022

Physical Deep Learning with Biologically Plausible Training Method
Mitsumasa Nakajima, Katsuma Inoue, Kenji Tanaka, Yasuo Kuniyoshi, Toshikazu Hashimoto, Kohei Nakajima
AI4CE · 01 Apr 2022

How does the pre-training objective affect what large language models learn about linguistic properties?
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Ahmed Alajrami, Nikolaos Aletras
20 Mar 2022

Efficient and Private Federated Learning with Partially Trainable Networks
Hakim Sidahmed, Zheng Xu, Ankush Garg, Yuan Cao, Mingqing Chen
FedML · 06 Oct 2021

What's Hidden in a One-layer Randomly Weighted Transformer?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Sheng Shen, Z. Yao, Douwe Kiela, Kurt Keutzer, Michael W. Mahoney
08 Sep 2021

Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Koustuv Sinha, Robin Jia, Dieuwke Hupkes, J. Pineau, Adina Williams, Douwe Kiela
14 Apr 2021