Thinking Like Transformers
13 June 2021
Gail Weiss
Yoav Goldberg
Eran Yahav
Papers citing "Thinking Like Transformers" (9 of 109 papers shown)
Neural Networks and the Chomsky Hierarchy
Grégoire Delétang, Anian Ruoss, Jordi Grau-Moya, Tim Genewein, L. Wenliang, ..., Chris Cundy, Marcus Hutter, Shane Legg, Joel Veness, Pedro A. Ortega
05 Jul 2022

FloorGenT: Generative Vector Graphic Model of Floor Plans for Robotics
Ludvig Ericson, Patric Jensfelt
07 Mar 2022

Overcoming a Theoretical Limitation of Self-Attention
David Chiang, Peter A. Cholak
24 Feb 2022

The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization
Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber
14 Oct 2021

Saturated Transformers are Constant-Depth Threshold Circuits
William Merrill, Ashish Sabharwal, Noah A. Smith
30 Jun 2021

Big Bird: Transformers for Longer Sequences
Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed
28 Jul 2020

Recognizing Long Grammatical Sequences Using Recurrent Networks Augmented With An External Differentiable Stack
A. Mali, Alexander Ororbia, Daniel Kifer, C. Lee Giles
04 Apr 2020

Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy, M. Saffar, Ashish Vaswani, David Grangier
12 Mar 2020

Effective Approaches to Attention-based Neural Machine Translation
Thang Luong, Hieu H. Pham, Christopher D. Manning
17 Aug 2015