Linear Transformers Are Secretly Fast Weight Programmers
Imanol Schlag, Kazuki Irie, Jürgen Schmidhuber
arXiv:2102.11174, 22 February 2021
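For context, the paper's central claim is that causal linear attention can be rewritten as a network that "programs" a fast weight matrix with outer-product updates. The NumPy sketch below illustrates that equivalence under common assumptions (an ELU+1 feature map and simple sum normalization); it is an illustration, not the authors' code.

```python
import numpy as np

def phi(x):
    # Feature map; ELU(x) + 1 is a common choice for linear attention (assumption).
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    # Causal linear attention written in its "attention" form: pairwise sums over the past.
    T = Q.shape[0]
    out = np.zeros_like(V)
    for t in range(T):
        q = phi(Q[t])
        num = sum(phi(K[j]) @ q * V[j] for j in range(t + 1))
        den = sum(phi(K[j]) @ q for j in range(t + 1))
        out[t] = num / den
    return out

def fast_weight_programmer(Q, K, V):
    # The same computation as a recurrent fast-weight net: one pass over the sequence.
    d_k, d_v = K.shape[1], V.shape[1]
    W = np.zeros((d_v, d_k))   # fast weight matrix, written by outer products
    z = np.zeros(d_k)          # running normalizer
    out = np.zeros_like(V)
    for t in range(Q.shape[0]):
        k, v, q = phi(K[t]), V[t], phi(Q[t])
        W += np.outer(v, k)    # "program" the fast weights with the new key/value pair
        z += k
        out[t] = (W @ q) / (z @ q)
    return out

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 8, 4))
assert np.allclose(linear_attention(Q, K, V), fast_weight_programmer(Q, K, V))
```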
Papers citing "Linear Transformers Are Secretly Fast Weight Programmers" (12 of 162 shown):
Hybrid Random Features. K. Choromanski, Haoxian Chen, Han Lin, Yuanzhe Ma, Arijit Sehanobish, ..., Andy Zeng, Valerii Likhosherstov, Dmitry Kalashnikov, Vikas Sindhwani, Adrian Weller. 08 Oct 2021.
ABC: Attention with Bounded-memory Control. Hao Peng, Jungo Kasai, Nikolaos Pappas, Dani Yogatama, Zhaofeng Wu, Lingpeng Kong, Roy Schwartz, Noah A. Smith. 06 Oct 2021.
Ripple Attention for Visual Perception with Sub-quadratic Complexity. Lin Zheng, Huijie Pan, Lingpeng Kong. 06 Oct 2021.
Learning with Holographic Reduced Representations. Ashwinkumar Ganesan, Hang Gao, S. Gandhi, Edward Raff, Tim Oates, James Holt, Mark McLean. 05 Sep 2021.
The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers. Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber. 26 Aug 2021.
Going Beyond Linear Transformers with Recurrent Fast Weight Programmers. Kazuki Irie, Imanol Schlag, Róbert Csordás, Jürgen Schmidhuber. 11 Jun 2021.
Staircase Attention for Recurrent Processing of Sequences. Da Ju, Stephen Roller, Sainbayar Sukhbaatar, Jason Weston. 08 Jun 2021.
Choose a Transformer: Fourier or Galerkin. Shuhao Cao. 31 May 2021.
LambdaNetworks: Modeling Long-Range Interactions Without Attention. Irwan Bello. 17 Feb 2021.
Meta Learning Backpropagation And Improving It. Louis Kirsch, Jürgen Schmidhuber. 29 Dec 2020.
On the Binding Problem in Artificial Neural Networks. Klaus Greff, Sjoerd van Steenkiste, Jürgen Schmidhuber. 09 Dec 2020.
A Decomposable Attention Model for Natural Language Inference. Ankur P. Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit. 06 Jun 2016.