ResearchTrend.AI
Exploring Length Generalization in Large Language Models

11 July 2022
Cem Anil
Yuhuai Wu
Anders Andreassen
Aitor Lewkowycz
Vedant Misra
V. Ramasesh
Ambrose Slone
Guy Gur-Ari
Ethan Dyer
Behnam Neyshabur
ReLM
LRM
arXiv: 2207.04901

Papers citing "Exploring Length Generalization in Large Language Models"

26 / 26 papers shown
Can Large Language Models Learn Formal Logic? A Data-Driven Training and Evaluation Framework
Yuan Xia
Akanksha Atrey
Fadoua Khmaissia
Kedar S. Namjoshi
LRM
ELM
45
0
0
28 Apr 2025
Distributional Scaling Laws for Emergent Capabilities
Rosie Zhao
Tian Qin
David Alvarez-Melis
Sham Kakade
Naomi Saphra
LRM
37
0
0
24 Feb 2025
MathGAP: Out-of-Distribution Evaluation on Problems with Arbitrarily Complex Proofs
Andreas Opedal
Haruki Shirakami
Bernhard Schölkopf
Abulhair Saparov
Mrinmaya Sachan
LRM
54
1
0
17 Feb 2025
Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers
Alireza Amiri
Xinting Huang
Mark Rofin
Michael Hahn
LRM
119
0
0
04 Feb 2025
Self-Improving Transformers Overcome Easy-to-Hard and Length Generalization Challenges
Nayoung Lee
Ziyang Cai
Avi Schwarzschild
Kangwook Lee
Dimitris Papailiopoulos
ReLM
VLM
LRM
AI4CE
73
4
0
03 Feb 2025
Out-of-distribution generalization via composition: a lens through induction heads in Transformers
Jiajun Song
Zhuoyan Xu
Yiqiao Zhong
80
4
0
31 Dec 2024
When Can Transformers Count to n?
Gilad Yehudai
Haim Kaplan
Asma Ghandeharioun
Mor Geva
Amir Globerson
32
10
0
21 Jul 2024
Representing Rule-based Chatbots with Transformers
Dan Friedman
Abhishek Panigrahi
Danqi Chen
59
1
0
15 Jul 2024
The CLRS-Text Algorithmic Reasoning Language Benchmark
Larisa Markeeva
Sean McLeish
Borja Ibarz
Wilfried Bounsi
Olga Kozlova
Alex Vitvitskyi
Charles Blundell
Tom Goldstein
Avi Schwarzschild
Petar Veličković
LRM
34
12
0
06 Jun 2024
Chain of Thoughtlessness? An Analysis of CoT in Planning
Kaya Stechly
Karthik Valmeekam
Subbarao Kambhampati
LRM
LM&Ro
59
37
0
08 May 2024
MathWriting: A Dataset For Handwritten Mathematical Expression Recognition
Philippe Gervais
Asya Fadeeva
Andrii Maksai
23
4
0
16 Apr 2024
Positional Information Matters for Invariant In-Context Learning: A Case Study of Simple Function Classes
Yongqiang Chen
Binghui Xie
Kaiwen Zhou
Bo Han
Yatao Bian
James Cheng
27
2
0
30 Nov 2023
Adaptivity and Modularity for Efficient Generalization Over Task Complexity
Samira Abnar
Omid Saremi
Laurent Dinh
Shantel Wilson
Miguel Angel Bautista
...
Vimal Thilak
Etai Littwin
Jiatao Gu
Josh Susskind
Samy Bengio
22
5
0
13 Oct 2023
It Ain't That Bad: Understanding the Mysterious Performance Drop in OOD Generalization for Generative Transformer Models
Xingcheng Xu
Zihao Pan
Haipeng Zhang
Yanqing Yang
LRM
8
2
0
16 Aug 2023
Faith and Fate: Limits of Transformers on Compositionality
Nouha Dziri
Ximing Lu
Melanie Sclar
Xiang Lorraine Li
Liwei Jiang
...
Sean Welleck
Xiang Ren
Allyson Ettinger
Zaïd Harchaoui
Yejin Choi
ReLM
LRM
28
327
0
29 May 2023
Generalization on the Unseen, Logic Reasoning and Degree Curriculum
Emmanuel Abbe
Samy Bengio
Aryo Lotfi
Kevin Rizk
LRM
16
47
0
30 Jan 2023
Towards Reasoning in Large Language Models: A Survey
Jie Huang
Kevin Chen-Chuan Chang
LM&MA
ELM
LRM
19
579
0
20 Dec 2022
Reasoning with Language Model Prompting: A Survey
Shuofei Qiao
Yixin Ou
Ningyu Zhang
Xiang Chen
Yunzhi Yao
Shumin Deng
Chuanqi Tan
Fei Huang
Huajun Chen
ReLM
ELM
LRM
49
307
0
19 Dec 2022
General-Purpose In-Context Learning by Meta-Learning Transformers
Louis Kirsch
James Harrison
Jascha Narain Sohl-Dickstein
Luke Metz
27
72
0
08 Dec 2022
Transformers Learn Shortcuts to Automata
Bingbin Liu
Jordan T. Ash
Surbhi Goel
A. Krishnamurthy
Cyril Zhang
OffRL
LRM
19
155
0
19 Oct 2022
Systematic Generalization and Emergent Structures in Transformers Trained on Structured Tasks
Yuxuan Li
James L. McClelland
26
17
0
02 Oct 2022
Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit
Boaz Barak
Benjamin L. Edelman
Surbhi Goel
Sham Kakade
Eran Malach
Cyril Zhang
25
123
0
18 Jul 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,402
0
28 Jan 2022
Does Entity Abstraction Help Generative Transformers Reason?
Nicolas Angelard-Gontier
Siva Reddy
C. Pal
19
5
0
05 Jan 2022
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press
Noah A. Smith
M. Lewis
245
695
0
27 Aug 2021
From Local Structures to Size Generalization in Graph Neural Networks
Gilad Yehudai
Ethan Fetaya
E. Meirom
Gal Chechik
Haggai Maron
GNN
AI4CE
142
123
0
17 Oct 2020