arXiv: 2310.01749
Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns
3 October 2023
Brian DuSell, David Chiang
Papers citing "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns" (6 of 6 papers shown)
TRA: Better Length Generalisation with Threshold Relative Attention
Mattia Opper, Roland Fernandez, P. Smolensky, Jianfeng Gao
29 Mar 2025

Training Neural Networks as Recognizers of Formal Languages
Alexandra Butoi, Ghazal Khalighinejad, Anej Svete, Josef Valvoda, Ryan Cotterell, Brian DuSell
11 Nov 2024

Finding path and cycle counting formulae in graphs with Deep Reinforcement Learning
Jason Piquenot, Maxime Bérar, Pierre Héroux, Jean-Yves Ramel, R. Raveaux, Sébastien Adam
02 Oct 2024

Investigating Recurrent Transformers with Dynamic Halt
Jishnu Ray Chowdhury, Cornelia Caragea
01 Feb 2024

The Surprising Computational Power of Nondeterministic Stack RNNs
Brian DuSell, David Chiang
04 Oct 2022

Transformers Generalize Linearly
Jackson Petty, Robert Frank
24 Sep 2021