Tighter Bounds on the Expressivity of Transformer Encoders

25 January 2023 · arXiv:2301.10743
David Chiang, Peter A. Cholak, A. Pillay

Papers citing "Tighter Bounds on the Expressivity of Transformer Encoders"

36 / 36 papers shown
  • Unique Hard Attention: A Tale of Two Sides
    Selim Jerad, Anej Svete, Jiaoda Li, Ryan Cotterell · 18 Mar 2025
  • A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers
    William Merrill, Ashish Sabharwal · 05 Mar 2025
  • Ask, and it shall be given: On the Turing completeness of prompting
    Ruizhong Qiu, Zhe Xu, W. Bao, Hanghang Tong · 24 Feb 2025 [ReLM, LRM, AI4CE]
  • The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve?
    Zhenheng Tang, Xiang Liu, Qian Wang, Peijie Dong, Bingsheng He, Xiaowen Chu, Bo Li · 24 Feb 2025 [LRM]
  • Ehrenfeucht-Haussler Rank and Chain of Thought
    Pablo Barceló, A. Kozachinskiy, Tomasz Steifer · 22 Jan 2025 [LRM]
  • How Numerical Precision Affects Mathematical Reasoning Capabilities of LLMs
    Guhao Feng, Kai-Bo Yang, Yuntian Gu, Xinyue Ai, Shengjie Luo, Jiacheng Sun, Di He, Z. Li, Liwei Wang · 17 Oct 2024 [LRM]
  • Learning Linear Attention in Polynomial Time
    Morris Yau, Ekin Akyürek, Jiayuan Mao, Joshua B. Tenenbaum, Stefanie Jegelka, Jacob Andreas · 14 Oct 2024
  • A mechanistically interpretable neural network for regulatory genomics
    Alex Tseng, Gökçen Eraslan, Tommaso Biancalani, Gabriele Scalia · 08 Oct 2024
  • GUNDAM: Aligning Large Language Models with Graph Understanding
    Sheng Ouyang, Yulan Hu, Ge Chen, Yong Liu · 30 Sep 2024 [AI4CE]
  • Transformers in Uniform TC$^0$
    David Chiang · 20 Sep 2024
  • Transformers are Universal In-context Learners
    Takashi Furuya, Maarten V. de Hoop, Gabriel Peyré · 02 Aug 2024
  • Separations in the Representational Capabilities of Transformers and Recurrent Architectures
    S. Bhattamishra, Michael Hahn, Phil Blunsom, Varun Kanade · 13 Jun 2024 [GNN]
  • How Out-of-Distribution Detection Learning Theory Enhances Transformer: Learnability and Reliability
    Yijin Zhou, Yuguang Wang, Xiaowen Dong, Yuguang Wang · 13 Jun 2024
  • The Expressive Capacity of State Space Models: A Formal Language Perspective
    Yash Sarrof, Yana Veitsman, Michael Hahn · 27 May 2024 [Mamba]
  • A Transformer with Stack Attention
    Jiaoda Li, Jennifer C. White, Mrinmaya Sachan, Ryan Cotterell · 07 May 2024
  • Transformers Can Represent $n$-gram Language Models
    Anej Svete, Ryan Cotterell · 23 Apr 2024
  • The Illusion of State in State-Space Models
    William Merrill, Jackson Petty, Ashish Sabharwal · 12 Apr 2024
  • Counting Like Transformers: Compiling Temporal Counting Logic Into Softmax Transformers
    Andy Yang, David Chiang · 05 Apr 2024
  • Simulating Weighted Automata over Sequences and Trees with Transformers
    Michael Rizvi, M. Lizaire, Clara Lacroce, Guillaume Rabusseau · 12 Mar 2024 [AI4CE]
  • Transformers are Expressive, But Are They Expressive Enough for Regression?
    Swaroop Nath, H. Khadilkar, Pushpak Bhattacharyya · 23 Feb 2024
  • Chain of Thought Empowers Transformers to Solve Inherently Serial Problems
    Zhiyuan Li, Hong Liu, Denny Zhou, Tengyu Ma · 20 Feb 2024 [LRM, AI4CE]
  • Strong hallucinations from negation and how to fix them
    Nicholas Asher, Swarnadeep Bhar · 16 Feb 2024 [ReLM, LRM]
  • Why are Sensitive Functions Hard for Transformers?
    Michael Hahn, Mark Rofin · 15 Feb 2024
  • An Examination on the Effectiveness of Divide-and-Conquer Prompting in Large Language Models
    Yizhou Zhang, Lun Du, Defu Cao, Qiang Fu, Yan Liu · 08 Feb 2024 [LRM]
  • Repeat After Me: Transformers are Better than State Space Models at Copying
    Samy Jelassi, David Brandfonbrener, Sham Kakade, Eran Malach · 01 Feb 2024
  • On The Expressivity of Recurrent Neural Cascades
    Nadezda A. Knorozova, Alessandro Ronca · 14 Dec 2023
  • What Formal Languages Can Transformers Express? A Survey
    Lena Strobl, William Merrill, Gail Weiss, David Chiang, Dana Angluin · 01 Nov 2023 [AI4CE]
  • Masked Hard-Attention Transformers Recognize Exactly the Star-Free Languages
    Andy Yang, David Chiang, Dana Angluin · 21 Oct 2023
  • Do pretrained Transformers Learn In-Context by Gradient Descent?
    Lingfeng Shen, Aayush Mishra, Daniel Khashabi · 12 Oct 2023
  • The Expressive Power of Transformers with Chain of Thought
    William Merrill, Ashish Sabharwal · 11 Oct 2023 [LRM, AI4CE, ReLM]
  • Logical Languages Accepted by Transformer Encoders with Hard Attention
    Pablo Barceló, A. Kozachinskiy, A. W. Lin, Vladimir Podolskii · 05 Oct 2023
  • Evaluating Transformer's Ability to Learn Mildly Context-Sensitive Languages
    Shunjie Wang, Shane Steinert-Threlkeld · 02 Sep 2023
  • Large Language Models
    Michael R Douglas · 11 Jul 2023 [LLMAG, LM&MA]
  • Faith and Fate: Limits of Transformers on Compositionality
    Nouha Dziri, Ximing Lu, Melanie Sclar, Xiang Lorraine Li, Liwei Jiang, ..., Sean Welleck, Xiang Ren, Allyson Ettinger, Zaïd Harchaoui, Yejin Choi · 29 May 2023 [ReLM, LRM]
  • A Logic for Expressing Log-Precision Transformers
    William Merrill, Ashish Sabharwal · 06 Oct 2022 [ReLM, NAI, LRM]
  • Large Language Models are Zero-Shot Reasoners
    Takeshi Kojima, S. Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa · 24 May 2022 [ReLM, LRM]