2108.12284
v4 (latest)
The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
26 August 2021
Róbert Csordás
Kazuki Irie
Jürgen Schmidhuber
ViT
Papers citing
"The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers"
50 / 104 papers shown
Closing the Curvature Gap: Full Transformer Hessians and Their Implications for Scaling Laws
Egor Petrov
Nikita Kiselev
Vladislav Meshkov
Andrey Grabovoy
123
0
0
19 Oct 2025
Learning neuro-symbolic convergent term rewriting systems
Flavio Petruzzellis
Alberto Testolin
A. Sperduti
NAI
125
0
0
25 Jul 2025
Scaling can lead to compositional generalization
Florian Redhardt
Yassir Akram
Simon Schug
GNN
CoGe
204
0
0
09 Jul 2025
Behavioural vs. Representational Systematicity in End-to-End Models: An Opinionated Survey
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Ivan Vegner
Sydelle de Souza
Valentin Forch
Martha Lewis
Leonidas A.A. Doumas
230
3
0
04 Jun 2025
Characterizing Pattern Matching and Its Limits on Compositional Task Structures
Hoyeon Chang
Jinho Park
Hanseul Cho
Sohee Yang
Miyoung Ko
Hyeonbin Hwang
Seungpil Won
Dohaeng Lee
Youbin Ahn
Minjoon Seo
278
1
0
26 May 2025
TRACE for Tracking the Emergence of Semantic Representations in Transformers
Nura Aljaafari
Danilo S. Carvalho
André Freitas
240
1
0
23 May 2025
Comparison of Different Deep Neural Network Models in the Cultural Heritage Domain
Teodor Boyadzhiev
Gabriele Lagani
Luca Ciampi
Giuseppe Amato
Krassimira Ivanova
VLM
229
1
0
30 Apr 2025
Exploring Compositional Generalization (in COGS/ReCOGS_pos) by Transformers using Restricted Access Sequence Processing (RASP)
William Bruns
609
0
0
21 Apr 2025
Context-aware Biases for Length Extrapolation
Ali Veisi
Hamidreza Amirzadeh
Amir Mansourian
563
2
0
11 Mar 2025
Structural Deep Encoding for Table Question Answering
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Raphael Mouravieff
Benjamin Piwowarski
Sylvain Lamprier
LMTD
277
2
0
03 Mar 2025
The Role of Sparsity for Length Generalization in Transformers
Noah Golowich
Samy Jelassi
David Brandfonbrener
Sham Kakade
Eran Malach
237
6
0
24 Feb 2025
Analyzing the Inner Workings of Transformers in Compositional Generalization
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Ryoma Kumon
Hitomi Yanaka
327
1
0
24 Feb 2025
Compositional Generalization Across Distributional Shifts with Sparse Tree Operations
Neural Information Processing Systems (NeurIPS), 2024
Paul Soulos
Henry Conklin
Mattia Opper
P. Smolensky
Jianfeng Gao
Roland Fernandez
315
6
0
18 Dec 2024
Quantifying artificial intelligence through algorithmic generalization
Nature Machine Intelligence (Nat. Mach. Intell.), 2024
Takuya Ito
Murray Campbell
L. Horesh
Tim Klinger
Parikshit Ram
ELM
442
0
0
08 Nov 2024
Overcoming classic challenges for artificial neural networks by providing incentives and practice
Nature Machine Intelligence (Nat. Mach. Intell.), 2024
Kazuki Irie
Brenden M. Lake
577
8
0
14 Oct 2024
Adaptive Prediction Ensemble: Improving Out-of-Distribution Generalization of Motion Forecasting
Jinning Li
Jiachen Li
Sangjae Bae
David Isele
273
7
0
12 Jul 2024
Teaching Transformers Causal Reasoning through Axiomatic Training
Aniket Vashishtha
Abhinav Kumar
Atharva Pandey
Abbavaram Gowtham Reddy
Amit Sharma
Vineeth N. Balasubramanian
413
8
0
10 Jul 2024
Are there identifiable structural parts in the sentence embedding whole?
Vivi Nastase
Paola Merlo
198
6
0
24 Jun 2024
Evaluating Structural Generalization in Neural Machine Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Ryoma Kumon
Daiki Matsuoka
Hitomi Yanaka
NAI
199
2
0
19 Jun 2024
On the Minimal Degree Bias in Generalization on the Unseen for non-Boolean Functions
Denys Pushkin
Raphael Berthier
Emmanuel Abbe
206
0
0
10 Jun 2024
MoEUT: Mixture-of-Experts Universal Transformers
Róbert Csordás
Kazuki Irie
Jürgen Schmidhuber
Christopher Potts
Christopher D. Manning
MoE
247
28
0
25 May 2024
From Frege to chatGPT: Compositionality in language, cognition, and deep neural networks
Jacob Russin
Sam Whitman McGrath
Danielle J. Williams
AI4CE
516
6
0
24 May 2024
Philosophy of Cognitive Science in the Age of Deep Learning
Raphaël Millière
AI4CE
NAI
219
8
0
07 May 2024
What makes Models Compositional? A Theoretical View: With Supplement
International Joint Conference on Artificial Intelligence (IJCAI), 2024
Parikshit Ram
Tim Klinger
Alexander G. Gray
CoGe
277
8
0
02 May 2024
Setting up the Data Printer with Improved English to Ukrainian Machine Translation
Yurii Paniv
Dmytro Chaplynskyi
Nikita Trynus
Volodymyr Kyrylov
AI4CE
266
3
0
23 Apr 2024
Sequential Compositional Generalization in Multimodal Models
Semih Yagcioglu
Osman Batur İnce
Aykut Erdem
Erkut Erdem
Desmond Elliott
Deniz Yuret
195
1
0
18 Apr 2024
Enhancing Length Extrapolation in Sequential Models with Pointer-Augmented Neural Memory
Hung Le
D. Nguyen
Kien Do
Svetha Venkatesh
T. Tran
204
0
0
18 Apr 2024
Towards Understanding the Relationship between In-context Learning and Compositional Generalization
International Conference on Language Resources and Evaluation (LREC), 2024
Sungjun Han
Sebastian Padó
CoGe
210
5
0
18 Mar 2024
A Neural Rewriting System to Solve Algorithmic Problems
Flavio Petruzzellis
Alberto Testolin
A. Sperduti
NAI
244
2
0
27 Feb 2024
Benchmarking GPT-4 on Algorithmic Problems: A Systematic Evaluation of Prompting Strategies
Flavio Petruzzellis
Alberto Testolin
A. Sperduti
ELM
313
14
0
27 Feb 2024
Inducing Systematicity in Transformers by Attending to Structurally Quantized Embeddings
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Yichen Jiang
Xiang Zhou
Mohit Bansal
274
1
0
09 Feb 2024
Limits of Transformer Language Models on Learning to Compose Algorithms
Jonathan Thomm
Aleksandar Terzić
Giacomo Camposampiero
Michael Hersche
Bernhard Schölkopf
Abbas Rahimi
478
11
0
08 Feb 2024
On the generalization capacity of neural networks during generic multimodal reasoning
International Conference on Learning Representations (ICLR), 2024
Takuya Ito
Soham Dan
Mattia Rigotti
James Kozloski
Murray Campbell
LRM
229
4
0
26 Jan 2024
Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks
International Conference on Machine Learning (ICML), 2023
Rahul Ramesh
Ekdeep Singh Lubana
Mikail Khona
Robert P. Dick
Hidenori Tanaka
CoGe
325
14
0
21 Nov 2023
Attribute Diversity Determines the Systematicity Gap in VQA
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Ian Berlot-Attwell
Kumar Krishna Agrawal
A. M. Carrell
Yash Sharma
Naomi Saphra
254
2
0
15 Nov 2023
Data Factors for Better Compositional Generalization
Xiang Zhou
Yichen Jiang
Mohit Bansal
CoGe
OOD
190
7
0
08 Nov 2023
Syntax-Guided Transformers: Elevating Compositional Generalization and Grounding in Multimodal Environments
Danial Kamali
Parisa Kordjamshidi
210
1
0
07 Nov 2023
The Impact of Depth on Compositional Generalization in Transformer Language Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Jackson Petty
Sjoerd van Steenkiste
Ishita Dasgupta
Fei Sha
Daniel H Garrette
Tal Linzen
AI4CE
VLM
313
30
0
30 Oct 2023
SLOG: A Structural Generalization Benchmark for Semantic Parsing
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Bingzhi Li
L. Donatelli
Alexander Koller
Tal Linzen
Yuekun Yao
Najoung Kim
196
19
0
23 Oct 2023
Structural generalization in COGS: Supertagging is (almost) all you need
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Alban Petit
Caio Corro
François Yvon
NAI
192
1
0
21 Oct 2023
Harnessing Dataset Cartography for Improved Compositional Generalization in Transformers
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Osman Batur İnce
Tanin Zeraati
Semih Yagcioglu
Yadollah Yaghoobzadeh
Erkut Erdem
Aykut Erdem
164
3
0
18 Oct 2023
Adaptivity and Modularity for Efficient Generalization Over Task Complexity
Samira Abnar
Omid Saremi
Laurent Dinh
Shantel Wilson
Miguel Angel Bautista
...
Vimal Thilak
Etai Littwin
Jiatao Gu
Josh Susskind
Samy Bengio
331
8
0
13 Oct 2023
Sparse Universal Transformer
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Shawn Tan
Songlin Yang
Zhenfang Chen
Aaron Courville
Chuang Gan
MoE
260
24
0
11 Oct 2023
DecoderLens: Layerwise Interpretation of Encoder-Decoder Transformers
Anna Langedijk
Hosein Mohebbi
Gabriele Sarti
Willem H. Zuidema
Jaap Jumelet
242
15
0
05 Oct 2023
Compositional Program Generation for Few-Shot Systematic Generalization
Tim Klinger
Luke Liu
Soham Dan
A. Rezaee
Parikshit Ram
Ali Movaghar
NAI
215
4
0
28 Sep 2023
Efficient Benchmarking of Language Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Yotam Perlitz
Elron Bandel
Ariel Gera
Ofir Arviv
L. Ein-Dor
Eyal Shnarch
Noam Slonim
Michal Shmueli-Scheuer
Leshem Choshen
ALM
515
38
0
22 Aug 2023
ExeDec: Execution Decomposition for Compositional Generalization in Neural Program Synthesis
International Conference on Learning Representations (ICLR), 2023
Kensen Shi
Joey Hong
Yinlin Deng
Pengcheng Yin
Manzil Zaheer
Charles Sutton
184
20
0
26 Jul 2023
A Hybrid System for Systematic Generalization in Simple Arithmetic Problems
International Workshop on Neural-Symbolic Learning and Reasoning (NeSy), 2023
Flavio Petruzzellis
Alberto Testolin
A. Sperduti
AIMat
LRM
186
1
0
29 Jun 2023
Towards Robust Aspect-based Sentiment Analysis through Non-counterfactual Augmentations
Xinyu Liu
Yanl Ding
Kaikai An
Chunyang Xiao
Pranava Madhyastha
Tong Xiao
Jingbo Zhu
159
2
0
24 Jun 2023
Differentiable Tree Operations Promote Compositional Generalization
International Conference on Machine Learning (ICML), 2023
Paul Soulos
J. E. Hu
Kate McCurdy
Yunmo Chen
Roland Fernandez
P. Smolensky
Jianfeng Gao
AI4CE
137
7
0
01 Jun 2023