Positional Description Matters for Transformers Arithmetic

22 November 2023
Ruoqi Shen
Sébastien Bubeck
Ronen Eldan
Yin Tat Lee
Yuanzhi Li
Yi Zhang

Papers citing "Positional Description Matters for Transformers Arithmetic"

36 / 36 papers shown
Context-aware Biases for Length Extrapolation
Ali Veisi
Amir Mansourian
50
0
0
11 Mar 2025
The Lookahead Limitation: Why Multi-Operand Addition is Hard for LLMs
Tanja Baeumel
Josef van Genabith
Simon Ostermann
LRM
50
1
0
27 Feb 2025
The Role of Sparsity for Length Generalization in Transformers
Noah Golowich
Samy Jelassi
David Brandfonbrener
Sham Kakade
Eran Malach
37
0
0
24 Feb 2025
Ignore the KL Penalty! Boosting Exploration on Critical Tokens to Enhance RL Fine-Tuning
Jean Vassoyan
Nathanaël Beau
Roman Plaud
OffRL
90
1
0
10 Feb 2025
Self-Improving Transformers Overcome Easy-to-Hard and Length Generalization Challenges
Nayoung Lee
Ziyang Cai
Avi Schwarzschild
Kangwook Lee
Dimitris Papailiopoulos
ReLM
VLM
LRM
AI4CE
66
4
0
03 Feb 2025
Golden Touchstone: A Comprehensive Bilingual Benchmark for Evaluating Financial Large Language Models
Xiaojun Wu
Junxi Liu
Huanyi Su
Zhouchi Lin
Yiyan Qi
...
Fuwei Wang
Saizhuo Wang
Fengrui Hua
Jia Li
Jian Guo
45
0
0
09 Nov 2024
Quantifying artificial intelligence through algebraic generalization
Takuya Ito
Murray Campbell
L. Horesh
Tim Klinger
Parikshit Ram
ELM
46
0
0
08 Nov 2024
Provable Length Generalization in Sequence Prediction via Spectral Filtering
Annie Marsden
Evan Dogariu
Naman Agarwal
Xinyi Chen
Daniel Suo
Elad Hazan
32
1
0
01 Nov 2024
On Positional Bias of Faithfulness for Long-form Summarization
David Wan
Jesse Vig
Mohit Bansal
Shafiq R. Joty
HILM
43
3
0
31 Oct 2024
How Numerical Precision Affects Mathematical Reasoning Capabilities of LLMs
Guhao Feng
Kai-Bo Yang
Yuntian Gu
Xinyue Ai
Shengjie Luo
Jiacheng Sun
Di He
Z. Li
Liwei Wang
LRM
27
5
0
17 Oct 2024
Automated Rewards via LLM-Generated Progress Functions
Vishnu Sarukkai
Brennan Shacklett
Zander Majercik
Kush S. Bhatia
Christopher Ré
Kayvon Fatahalian
26
1
0
11 Oct 2024
RespDiff: An End-to-End Multi-scale RNN Diffusion Model for Respiratory Waveform Estimation from PPG Signals
Yuyang Miao
Zehua Chen
C. Li
Danilo P. Mandic
DiffM
MedIm
23
0
0
06 Oct 2024
Positional Attention: Expressivity and Learnability of Algorithmic Computation
Artur Back de Luca
George Giapitzakis
Shenghao Yang
Petar Veličković
K. Fountoulakis
37
0
0
02 Oct 2024
Positional Description for Numerical Normalization
Deepanshu Gupta
Javier Latorre
3DGS
19
0
0
22 Aug 2024
Your Context Is Not an Array: Unveiling Random Access Limitations in Transformers
MohammadReza Ebrahimi
Sunny Panchal
Roland Memisevic
25
5
0
10 Aug 2024
Universal Length Generalization with Turing Programs
Kaiying Hou
David Brandfonbrener
Sham Kakade
Samy Jelassi
Eran Malach
29
7
0
03 Jul 2024
The SIFo Benchmark: Investigating the Sequential Instruction Following Ability of Large Language Models
Xinyi Chen
Baohao Liao
Jirui Qi
Panagiotis Eustratiadis
Christof Monz
Arianna Bisazza
Maarten de Rijke
ALM
ELM
LRM
23
5
0
28 Jun 2024
MatText: Do Language Models Need More than Text & Scale for Materials Modeling?
Nawaf Alampara
Santiago Miret
K. Jablonka
43
8
0
25 Jun 2024
LLMs Are Prone to Fallacies in Causal Inference
Nitish Joshi
Abulhair Saparov
Yixin Wang
He He
32
9
0
18 Jun 2024
Transformers meet Neural Algorithmic Reasoners
Wilfried Bounsi
Borja Ibarz
Andrew Dudzik
Jessica B. Hamrick
Larisa Markeeva
Alex Vitvitskyi
Razvan Pascanu
Petar Veličković
NAI
AI4CE
LRM
31
5
0
13 Jun 2024
The CLRS-Text Algorithmic Reasoning Language Benchmark
Larisa Markeeva
Sean McLeish
Borja Ibarz
Wilfried Bounsi
Olga Kozlova
Alex Vitvitskyi
Charles Blundell
Tom Goldstein
Avi Schwarzschild
Petar Veličković
LRM
28
12
0
06 Jun 2024
Explicitly Encoding Structural Symmetry is Key to Length Generalization in Arithmetic Tasks
Mahdi Sabbaghi
George Pappas
Hamed Hassani
Surbhi Goel
26
4
0
04 Jun 2024
Transformers Can Do Arithmetic with the Right Embeddings
Sean McLeish
Arpit Bansal
Alex Stein
Neel Jain
John Kirchenbauer
...
B. Kailkhura
A. Bhatele
Jonas Geiping
Avi Schwarzschild
Tom Goldstein
27
28
0
27 May 2024
NumeroLogic: Number Encoding for Enhanced LLMs' Numerical Reasoning
Eli Schwartz
Leshem Choshen
J. Shtok
Sivan Doveh
Leonid Karlinsky
Assaf Arbelle
26
13
0
30 Mar 2024
The pitfalls of next-token prediction
Gregor Bachmann
Vaishnavh Nagarajan
22
57
0
11 Mar 2024
Machine learning for modular multiplication
Kristin E. Lauter
C. Li
Krystal Maughan
Rachel Newton
Megha Srivastava
11
3
0
29 Feb 2024
Transformers Can Achieve Length Generalization But Not Robustly
Yongchao Zhou
Uri Alon
Xinyun Chen
Xuezhi Wang
Rishabh Agarwal
Denny Zhou
27
36
0
14 Feb 2024
Repeat After Me: Transformers are Better than State Space Models at Copying
Samy Jelassi
David Brandfonbrener
Sham Kakade
Eran Malach
92
77
0
01 Feb 2024
Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners?
Andreas Opedal
Alessandro Stolfo
Haruki Shirakami
Ying Jiao
Ryan Cotterell
Bernhard Schölkopf
Abulhair Saparov
Mrinmaya Sachan
LRM
27
12
0
31 Jan 2024
On Context Utilization in Summarization with Large Language Models
Mathieu Ravaut
Aixin Sun
Nancy F. Chen
Shafiq R. Joty
23
13
0
16 Oct 2023
How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model
Michael Hanna
Ollie Liu
Alexandre Variengien
LRM
181
116
0
30 Apr 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
203
2,232
0
22 Mar 2023
Systematic Generalization and Emergent Structures in Transformers Trained on Structured Tasks
Yuxuan Li
James L. McClelland
21
17
0
02 Oct 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
SHAPE: Shifted Absolute Position Embedding for Transformers
Shun Kiyono
Sosuke Kobayashi
Jun Suzuki
Kentaro Inui
223
44
0
13 Sep 2021
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press
Noah A. Smith
M. Lewis
234
690
0
27 Aug 2021