ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2211.12316
  4. Cited By
Simplicity Bias in Transformers and their Ability to Learn Sparse
  Boolean Functions

Simplicity Bias in Transformers and their Ability to Learn Sparse Boolean Functions

22 November 2022
S. Bhattamishra
Arkil Patel
Varun Kanade
Phil Blunsom
ArXivPDFHTML

Papers citing "Simplicity Bias in Transformers and their Ability to Learn Sparse Boolean Functions"

10 / 10 papers shown
Title
A distributional simplicity bias in the learning dynamics of transformers
A distributional simplicity bias in the learning dynamics of transformers
Riccardo Rende
Federica Gerace
A. Laio
Sebastian Goldt
68
8
0
17 Feb 2025
Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers
Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers
Alireza Amiri
Xinting Huang
Mark Rofin
Michael Hahn
LRM
96
0
0
04 Feb 2025
Training Neural Networks as Recognizers of Formal Languages
Training Neural Networks as Recognizers of Formal Languages
Alexandra Butoi
Ghazal Khalighinejad
Anej Svete
Josef Valvoda
Ryan Cotterell
Brian DuSell
NAI
33
2
0
11 Nov 2024
Neural Redshift: Random Networks are not Random Functions
Neural Redshift: Random Networks are not Random Functions
Damien Teney
A. Nicolicioiu
Valentin Hartmann
Ehsan Abbasnejad
89
18
0
04 Mar 2024
Investigating Recurrent Transformers with Dynamic Halt
Investigating Recurrent Transformers with Dynamic Halt
Jishnu Ray Chowdhury
Cornelia Caragea
34
1
0
01 Feb 2024
Simplicity bias, algorithmic probability, and the random logistic map
Simplicity bias, algorithmic probability, and the random logistic map
B. Hamzi
K. Dingle
23
3
0
31 Dec 2023
Do deep neural networks have an inbuilt Occam's razor?
Do deep neural networks have an inbuilt Occam's razor?
Chris Mingard
Henry Rees
Guillermo Valle Pérez
A. Louis
UQCV
BDL
19
15
0
13 Apr 2023
Neural Networks and the Chomsky Hierarchy
Neural Networks and the Chomsky Hierarchy
Grégoire Delétang
Anian Ruoss
Jordi Grau-Moya
Tim Genewein
L. Wenliang
...
Chris Cundy
Marcus Hutter
Shane Legg
Joel Veness
Pedro A. Ortega
UQCV
94
129
0
05 Jul 2022
Sensitivity as a Complexity Measure for Sequence Classification Tasks
Sensitivity as a Complexity Measure for Sequence Classification Tasks
Michael Hahn
Dan Jurafsky
Richard Futrell
143
22
0
21 Apr 2021
Memorisation versus Generalisation in Pre-trained Language Models
Memorisation versus Generalisation in Pre-trained Language Models
Michael Tänzer
Sebastian Ruder
Marek Rei
84
50
0
16 Apr 2021
1