Investigating the Limitations of Transformers with Simple Arithmetic Tasks

25 February 2021

Papers citing "Investigating the Limitations of Transformers with Simple Arithmetic Tasks"

Contrastive Decoding Mitigates Score Range Bias in LLM-as-a-Judge

Yoshinari Fujinuma

ELM

139

21 Oct 2025

To Infinity and Beyond: Tool-Use Unlocks Length Generalization in State Space Models

211

16 Oct 2025

Efficient numeracy in language models through single-token number embeddings

149

08 Oct 2025

The Art of Breaking Words: Rethinking Multilingual Tokenizer Design

Maunendra Sankar Desarkar

Ganesh Ramakrishnan

240

03 Aug 2025

Long-Short Alignment for Effective Long-Context Modeling in LLMs

231

13 Jun 2025

Is Random Attention Sufficient for Sequence Modeling? Disentangling Trainable Components in the Transformer

547

01 Jun 2025

Recursive Decomposition with Dependencies for Generic Divide-and-Conquer Reasoning

Sergio Hernández-Gutiérrez

319

05 May 2025

A Survey on Mathematical Reasoning and Optimization with Large Language Models

Ali Forootani

OffRL LRM AI4CE

388

22 Mar 2025

SuperBPE: Space Travel for Language Models

596

17 Mar 2025

Large Language Model as Meta-Surrogate for Data-Driven Many-Task Optimization: A Proof-of-Principle Study

Wei Wei

Yue-Jiao Gong

Jun Zhang

Ting Huang

Jun Zhang

368

11 Mar 2025

Simulating the Real World: A Unified Survey of Multimodal Generative Models

255

06 Mar 2025

The Lookahead Limitation: Why Multi-Operand Addition is Hard for LLMs

453

27 Feb 2025

Beyond In-Distribution Success: Scaling Curves of CoT Granularity for Language Model Generalization

456

25 Feb 2025

Reasoning with Latent Thoughts: On the Power of Looped Transformers

International Conference on Learning Representations (ICLR), 2025

678

113

24 Feb 2025

Int2Int: a framework for mathematics with transformers

François Charton

ViT

479

22 Feb 2025

Learning the symmetric group: large from small

248

18 Feb 2025

Mathematical Language Models: A Survey

...

689

03 Jan 2025

Quantifying artificial intelligence through algorithmic generalization

Nature Machine Intelligence (Nat. Mach. Intell.), 2024

502

08 Nov 2024

PatternBoost: Constructions in Mathematics with a Little Help from AI

226

01 Nov 2024

How Numerical Precision Affects Arithmetical Reasoning Capabilities of LLMs

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

364

17 Oct 2024

Language Models Encode Numbers Using Digit Representations in Base 10

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

Amit Arnold Levy

Mor Geva

349

15 Oct 2024

Global Lyapunov functions: a long-standing open problem in mathematics, with symbolic transformers

Neural Information Processing Systems (NeurIPS), 2024

Alberto Alfarano

François Charton

Amaury Hayat

268

10 Oct 2024

MLissard: Multilingual Long and Simple Sequential Reasoning Benchmarks

296

08 Oct 2024

RespDiff: An End-to-End Multi-scale RNN Diffusion Model for Respiratory Waveform Estimation from PPG Signals

388

06 Oct 2024

Scaling Behavior for Large Language Models regarding Numeral Systems: An Example using Pythia

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Zhejian Zhou

Jiayu Wang

Dahua Lin

Kai Chen

LRM

297

25 Sep 2024

Rule Extrapolation in Language Models: A Study of Compositional Generalization on OOD Prompts

Wieland Brendel

321

09 Sep 2024

Interpreting and Improving Large Language Models in Arithmetic Calculation

International Conference on Machine Learning (ICML), 2024

Wei Zhang

Chaoqun Wan

Yonggang Zhang

Yiu-ming Cheung

Xinmei Tian

Xu Shen

Jieping Ye

LRM

401

03 Sep 2024

Learning the Simplicity of Scattering Amplitudes

SciPost Physics (SciPost Phys.), 2024

Clifford Cheung

Aurélien Dersy

Matthew D. Schwartz

345

08 Aug 2024

Principled Understanding of Generalization for Generative Transformer Models in Arithmetic Reasoning Tasks

320

25 Jul 2024

The Extrapolation Power of Implicit Models

247

19 Jul 2024

Numbers Matter! Bringing Quantity-awareness to Retrieval Systems

Satya Almasian

Milena Bruseva

Michael Gertz

253

14 Jul 2024

Re-Tuning: Overcoming the Compositionality Limits of Large Language Models with Recursive Tuning

248

05 Jul 2024

Tools Fail: Detecting Silent Errors in Faulty Tools

389

27 Jun 2024

Less can be more for predicting properties with large language models

Nawaf Alampara

Santiago Miret

Kevin Maik Jablonka

494

25 Jun 2024

Pre-trained Large Language Models Use Fourier Features to Compute Addition

329

05 Jun 2024

Assessing the Emergent Symbolic Reasoning Abilities of Llama Large Language Models

346

05 Jun 2024

Language Models Do Hard Arithmetic Tasks Easily and Hardly Do Easy Arithmetic Tasks

Andrew Gambardella

Yusuke Iwasawa

Yutaka Matsuo

LRM

227

04 Jun 2024

Arbitrary-Length Generalization for Addition in a Tiny Transformer

A. G. Patriota

191

31 May 2024

Language Models Trained to do Arithmetic Predict Human Risky and Intertemporal Choice

Jian-Qiao Zhu

Haijiang Yan

Thomas Griffiths

385

29 May 2024

Disentangling and Integrating Relational and Sensory Information in Transformer Architectures

Awni Altabaa

John Lafferty

361

26 May 2024

Models That Prove Their Own Correctness

578

24 May 2024

Transforming the Bootstrap: Using Transformers to Compute Scattering Amplitudes in Planar N = 4 Super Yang-Mills Theory

395

09 May 2024

Position: Understanding LLMs Requires More Than Statistical Generalization

International Conference on Machine Learning (ICML), 2024

Wieland Brendel

411

03 May 2024

Evaluating Large Language Models on Time Series Feature Understanding: A Comprehensive Taxonomy and Benchmark

303

25 Apr 2024

Laying Anchors: Semantically Priming Numerals in Language Modeling

Mandar Sharma

Rutuja Murlidhar Taware

Pravesh Koirala

Nikhil Muralidhar

Naren Ramakrishnan

372

02 Apr 2024

A Neuro-Symbolic Approach to Monitoring Salt Content in Food

417

01 Apr 2024

A Theory for Length Generalization in Learning to Reason

Changnan Xiao

Bing Liu

LRM

396

31 Mar 2024

Laying the Foundation First? Investigating the Generalization from Atomic Skills to Complex Reasoning Tasks

Yanghua Xiao

218

14 Mar 2024

tsGT: Stochastic Time Series Modeling With Transformer

Marta Emilia Nowakowska

Lukasz Kaiser

Piotr Miłoś

314

08 Mar 2024

RORA: Robust Free-Text Rationale Evaluation

Daniel Khashabi

317

28 Feb 2024