A Transformer with Stack Attention
7 May 2024 (arXiv:2405.04515)
Jiaoda Li, Jennifer C. White, Mrinmaya Sachan, Ryan Cotterell

Papers citing "A Transformer with Stack Attention" (6 of 6 papers shown):

The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve?
Zhenheng Tang, Xiang Liu, Qian Wang, Peijie Dong, Bingsheng He, Xiaowen Chu, Bo Li
24 Feb 2025 (Tags: LRM)

Investigating Recurrent Transformers with Dynamic Halt
Jishnu Ray Chowdhury, Cornelia Caragea
01 Feb 2024

A Logic for Expressing Log-Precision Transformers
William Merrill, Ashish Sabharwal
06 Oct 2022 (Tags: ReLM, NAI, LRM)

Neural Networks and the Chomsky Hierarchy
Grégoire Delétang, Anian Ruoss, Jordi Grau-Moya, Tim Genewein, L. Wenliang, ..., Chris Cundy, Marcus Hutter, Shane Legg, Joel Veness, Pedro A. Ortega
05 Jul 2022 (Tags: UQCV)

Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press, Noah A. Smith, M. Lewis
27 Aug 2021

A Decomposable Attention Model for Natural Language Inference
Ankur P. Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit
06 Jun 2016