Explicitly Encoding Structural Symmetry is Key to Length Generalization in Arithmetic Tasks
arXiv:2406.01895 · 4 June 2024
Mahdi Sabbaghi, George Pappas, Hamed Hassani, Surbhi Goel

Papers citing "Explicitly Encoding Structural Symmetry is Key to Length Generalization in Arithmetic Tasks" (9 of 9 papers shown)

The Role of Sparsity for Length Generalization in Transformers
Noah Golowich, Samy Jelassi, David Brandfonbrener, Sham Kakade, Eran Malach
24 Feb 2025

Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers
Alireza Amiri, Xinting Huang, Mark Rofin, Michael Hahn
Tags: LRM
04 Feb 2025

Self-Improving Transformers Overcome Easy-to-Hard and Length Generalization Challenges
Nayoung Lee, Ziyang Cai, Avi Schwarzschild, Kangwook Lee, Dimitris Papailiopoulos
Tags: ReLM, VLM, LRM, AI4CE
03 Feb 2025

DAPE V2: Process Attention Score as Feature Map for Length Extrapolation
Chuanyang Zheng, Yihang Gao, Han Shi, Jing Xiong, Jiankai Sun, ..., Xiaozhe Ren, Michael Ng, Xin Jiang, Zhenguo Li, Yu Li
07 Oct 2024

Towards Better Out-of-Distribution Generalization of Neural Algorithmic Reasoning Tasks
Sadegh Mahdavi, Kevin Swersky, Thomas Kipf, Milad Hashemi, Christos Thrampoulidis, Renjie Liao
Tags: LRM, OOD, NAI
01 Nov 2022

Systematic Generalization and Emergent Structures in Transformers Trained on Structured Tasks
Yuxuan Li, James L. McClelland
02 Oct 2022

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, F. Xia, Ed H. Chi, Quoc Le, Denny Zhou
Tags: LM&Ro, LRM, AI4CE, ReLM
28 Jan 2022

Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press, Noah A. Smith, M. Lewis
27 Aug 2021

A Decomposable Attention Model for Natural Language Inference
Ankur P. Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit
06 Jun 2016