State-of-the-art generalisation research in NLP: A taxonomy and review

Nature Machine Intelligence (Nat. Mach. Intell.), 2022

6 October 2022

Verna Dankers

Christos Christodoulopoulos

Khuyagbaatar Batsuren

Papers citing "State-of-the-art generalisation research in NLP: A taxonomy and review"

50 / 77 papers shown

Z-Space: A Multi-Agent Tool Orchestration Framework for Enterprise-Grade LLM Automation

261

23 Nov 2025

On the Measure of a Model: From Intelligence to Generality

158

14 Nov 2025

Lightweight CNN Model Hashing with Higher-Order Statistics and Chaotic Mapping for Piracy Detection and Tamper Localization

Kunming Yang

Ling Chen

AAML

107

31 Oct 2025

MERGE: Minimal Expression-Replacement GEneralization Test for Natural Language Inference

Mădălina Zgreabăn

Tejaswini Deoskar

Lasha Abzianidze

184

28 Oct 2025

Community size rather than grammatical complexity better predicts Large Language Model accuracy in a novel Wug Test

164

14 Oct 2025

FOSSIL: Harnessing Feedback on Suboptimal Samples for Data-Efficient Generalisation with Imitation Learning for Embodied Vision-and-Language Tasks

176

13 Oct 2025

Mapping Semantic & Syntactic Relationships with Geometric Rotation

Michael Freenor

Lauren Alvarez

LLMSV

247

10 Oct 2025

Hybrid Models for Natural Language Reasoning: The Case of Syllogistic Logic

10 Oct 2025

MoVa: Towards Generalizable Classification of Human Morals and Values

146

29 Sep 2025

AgentCoMa: A Compositional Benchmark Mixing Commonsense and Mathematical Reasoning in Real-World Scenarios

204

27 Aug 2025

Numerical models outperform AI weather forecasts of record-breaking extremes

303

21 Aug 2025

How Causal Abstraction Underpins Computational Explanation

Atticus Geiger

Jacqueline Harding

Thomas Icard

180

15 Aug 2025

Are Knowledge and Reference in Multilingual Language Models Cross-Lingually Consistent?

Xi Ai

Mahardika Krisna Ihsani

Min-Yen Kan

HILM

275

17 Jul 2025

Assessing Intersectional Bias in Representations of Pre-Trained Image Recognition Models

Valerie Krug

Sebastian Stober

340

04 Jun 2025

Systematic Generalization in Language Models Scales with Information Entropy

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

Sondre Wold

Lucas Georges Gabriel Charpentier

Étienne Simon

485

19 May 2025

Domain Regeneration: How well do LLMs match syntactic properties of text domains?

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

423

12 May 2025

FLUKE: A Linguistically-Driven and Task-Agnostic Framework for Robustness Evaluation

577

24 Apr 2025

FinNLI: Novel Dataset for Multi-Genre Financial Natural Language Inference Benchmarking

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

444

22 Apr 2025

MiMu: Mitigating Multiple Shortcut Learning Behavior of Transformers

496

14 Apr 2025

MultiLoKo: a multilingual local knowledge benchmark for LLMs spanning 31 languages

Dieuwke Hupkes

Nikolay Bogoychev

1.1K

14 Apr 2025

TRA: Better Length Generalisation with Threshold Relative Attention

641

29 Mar 2025

Probing LLMs for Multilingual Discourse Generalization Through a Unified Label Set

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

496

13 Mar 2025

Structural Deep Encoding for Table Question Answering

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

332

03 Mar 2025

Gradient-Guided Annealing for Domain Generalization

Computer Vision and Pattern Recognition (CVPR), 2025

Aristotelis Ballas

Christos Diou

OOD

1.5K

27 Feb 2025

Can LLMs Help Uncover Insights about LLMs? A Large-Scale, Evolving Literature Analysis of Frontier LLMs

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

468

26 Feb 2025

Learning Latent Spaces for Domain Generalization in Time Series Forecasting

Songgaojun Deng

Maarten de Rijke

CML AI4TS OOD BDL

407

15 Dec 2024

Quantifying artificial intelligence through algorithmic generalization

Nature Machine Intelligence (Nat. Mach. Intell.), 2024

504

08 Nov 2024

Beyond the Numbers: Transparency in Relation Extraction Benchmark Creation and Leaderboards

Varvara Arzt

Allan Hanbury

375

07 Nov 2024

Frequency matters: Modeling irregular morphological patterns in Spanish with Transformers

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

Akhilesh Kakolu Ramarao

Kevin Tang

Dinah Baer-Henney

421

28 Oct 2024

Tokenization and Morphology in Multilingual Language Models: A Comparative Analysis of mT5 and ByT5

Thao Anh Dang

Limor Raviv

Lukas Galke

430

15 Oct 2024

The Mystery of Compositional Generalization in Graph-based Generative Commonsense Reasoning

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Xiyan Fu

Anette Frank

LRM

518

08 Oct 2024

Data Contamination Report from the 2024 CONDA Shared Task

Iker García-Ferrero

...

Yu-Min Tseng

338

31 Jul 2024

Investigating the Role of Instruction Variety and Task Difficulty in Robotic Manipulation Tasks

368

04 Jul 2024

Black Big Boxes: Tracing Adjective Order Preferences in Large Language Models

357

02 Jul 2024

Detection and Measurement of Syntactic Templates in Generated Text

Chantal Shaib

Yanai Elazar

Junyi Jessy Li

Byron C. Wallace

313

28 Jun 2024

Filtered Corpus Training (FiCT) Shows that Language Models can Generalize from Indirect Evidence

Yu Ying Chiu

Shane Steinert-Threlkeld

268

24 May 2024

From Frege to chatGPT: Compositionality in language, cognition, and deep neural networks

675

24 May 2024

Evaluating Subword Tokenization: Alien Subword Composition and OOV Generalization Challenge

Khuyagbaatar Batsuren

Ekaterina Vylomova

Verna Dankers

Tsetsuukhei Delgerbaatar

Omri Uzan

Yuval Pinter

Gábor Bella

210

20 Apr 2024

From Form(s) to Meaning: Probing the Semantic Depths of Language Models Using Multisense Consistency

340

18 Apr 2024

Multilingual Pretraining and Instruction Tuning Improve Cross-Lingual Knowledge Alignment, But Only Shallowly

352

06 Apr 2024

From Robustness to Improved Generalization and Calibration in Pre-trained Language Models

Josip Jukić

Jan Snajder

441

31 Mar 2024

THE COLOSSEUM: A Benchmark for Evaluating Generalization for Robotic Manipulation

452

119

13 Feb 2024

Efficient and Scalable Fine-Tune of Language Models for Genome Understanding

144

12 Feb 2024

A Philosophical Introduction to Language Models -- Part I: Continuity With Classic Debates

Raphael Milliere

Cameron Buckner

LRM ELM

261

08 Jan 2024

The ICL Consistency Test

370

08 Dec 2023

Walking a Tightrope -- Evaluating Large Language Models in High-Risk Domains

Carolin (Haas) Lawrence

AILaw ALM ELM

349

25 Nov 2023

GenCodeSearchNet: A Benchmark Test Suite for Evaluating Generalization in Programming Language Understanding

Andor Diera

Abdelhalim Hafedh Dahou

196

16 Nov 2023

Whispers of Doubt Amidst Echoes of Triumph in NLP Robustness

325

16 Nov 2023

On Using Distribution-Based Compositionality Assessment to Evaluate Compositional Generalisation in Machine Translation

291

14 Nov 2023

Robust Generalization Strategies for Morpheme Glossing in an Endangered Language Documentation Context

Michael Ginn

Alexis Palmer

285

05 Nov 2023