2207.09238
Formal Algorithms for Transformers
19 July 2022
Mary Phuong
Marcus Hutter
Papers citing "Formal Algorithms for Transformers"
44 / 44 papers shown
Title
Dual Filter: A Mathematical Framework for Inference using Transformer-like Architectures
Heng-Sheng Chang
P. Mehta
34
0
0
01 May 2025
Modes of Sequence Models and Learning Coefficients
Zhongtian Chen
Daniel Murfet
73
1
0
25 Apr 2025
Emergence of Computational Structure in a Neural Network Physics Simulator
Rohan Hitchcock
Gary W. Delaney
J. Manton
Richard Scalzo
Jingge Zhu
22
0
0
16 Apr 2025
Scaling Embedding Layers in Language Models
Da Yu
Edith Cohen
Badih Ghazi
Yangsibo Huang
Pritish Kamath
Ravi Kumar
Daogao Liu
Chiyuan Zhang
72
0
0
03 Feb 2025
Notes on the Mathematical Structure of GPT LLM Architectures
Spencer Becker-Kahn
24
1
0
25 Oct 2024
Computational design of target-specific linear peptide binders with TransformerBeta
Haowen Zhao
Francesco A. Aprile
Barbara Bravi
19
0
0
07 Oct 2024
Attention layers provably solve single-location regression
P. Marion
Raphael Berthier
Gérard Biau
Claire Boyer
42
2
0
02 Oct 2024
Variational Search Distributions
Daniel M. Steinberg
Rafael Oliveira
Cheng Soon Ong
Edwin V. Bonilla
30
0
0
10 Sep 2024
Learning Randomized Algorithms with Transformers
J. Oswald
Seijin Kobayashi
Yassir Akram
Angelika Steger
AAML
27
0
0
20 Aug 2024
Increasing transformer token length with a Maximum Entropy Principle Method
R. I. Cukier
16
1
0
17 Aug 2024
Kraken: Inherently Parallel Transformers For Efficient Multi-Device Inference
R. Prabhakar
Hengrui Zhang
D. Wentzlaff
23
0
0
14 Aug 2024
DeMansia: Mamba Never Forgets Any Tokens
Ricky Fang
Mamba
19
0
0
04 Aug 2024
Wavelets Are All You Need for Autoregressive Image Generation
Wael Mattar
Idan Levy
Nir Sharon
S. Dekel
22
3
0
28 Jun 2024
Opportunities in deep learning methods development for computational biology
Alex J. Lee
Reza Abbasi-Asl
AI4CE
22
0
0
12 Jun 2024
Improving Transformers using Faithful Positional Encoding
Tsuyoshi Idé
Jokin Labaien
Pin-Yu Chen
24
0
0
15 May 2024
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
Mostafa Elhoushi
Akshat Shrivastava
Diana Liskovich
Basil Hosmer
Bram Wasti
...
Saurabh Agarwal
Ahmed Roman
Ahmed Aly
Beidi Chen
Carole-Jean Wu
LRM
25
82
0
25 Apr 2024
M&M: Multimodal-Multitask Model Integrating Audiovisual Cues in Cognitive Load Assessment
Long Nguyen-Phuoc
Rénald Gaboriau
Dimitri Delacroix
Laurent Navarro
16
0
0
14 Mar 2024
Materials science in the era of large language models: a perspective
Ge Lei
Ronan Docherty
Samuel J. Cooper
35
3
0
11 Mar 2024
Neural Circuit Diagrams: Robust Diagrams for the Communication, Implementation, and Analysis of Deep Learning Architectures
Vincent Abbott
40
4
0
08 Feb 2024
Evolving Code with A Large Language Model
Erik Hemberg
Stephen Moskal
Una-May O’Reilly
20
24
0
13 Jan 2024
Can a Transformer Represent a Kalman Filter?
Gautam Goel
Peter L. Bartlett
16
11
0
12 Dec 2023
Natural Language Processing for Financial Regulation
I. Achitouv
Dragos Gorduza
Antoine Jacquier
AILaw
13
2
0
14 Nov 2023
What Formal Languages Can Transformers Express? A Survey
Lena Strobl
William Merrill
Gail Weiss
David Chiang
Dana Angluin
AI4CE
14
45
0
01 Nov 2023
Hierarchical Attention and Graph Neural Networks: Toward Drift-Free Pose Estimation
Kathia Melbouci
F. Nashashibi
14
0
0
18 Sep 2023
Large Language Models Vote: Prompting for Rare Disease Identification
David Oniani
Jordan Hilsman
Hang Dong
F. Gao
Shiven Verma
Yanshan Wang
19
12
0
24 Aug 2023
Enhancing Object Detection in Ancient Documents with Synthetic Data Generation and Transformer-Based Models
Zahra Ziran
F. Leotta
Massimo Mecella
13
1
0
29 Jul 2023
The Hydra Effect: Emergent Self-repair in Language Model Computations
Tom McGrath
Matthew Rahtz
János Kramár
Vladimir Mikulik
Shane Legg
MILM
LRM
11
68
0
28 Jul 2023
Large Language Models
Michael R Douglas
LLMAG
LM&MA
22
547
0
11 Jul 2023
Hierarchical Neural Simulation-Based Inference Over Event Ensembles
Lukas Heinrich
S. Mishra-Sharma
C. Pollard
Philipp Windischhofer
8
4
0
21 Jun 2023
RankFormer: Listwise Learning-to-Rank Using Listwide Labels
Maarten Buyl
Paul Missault
Pierre-Antoine Sondag
OffRL
9
6
0
09 Jun 2023
Agents Explore the Environment Beyond Good Actions to Improve Their Model for Better Decisions
Matthias Unverzagt
LLMAG
17
0
0
06 Jun 2023
White-Box Transformers via Sparse Rate Reduction
Yaodong Yu
Sam Buchanan
Druv Pai
Tianzhe Chu
Ziyang Wu
Shengbang Tong
B. Haeffele
Y. Ma
ViT
16
80
0
01 Jun 2023
Exposing Attention Glitches with Flip-Flop Language Modeling
Bingbin Liu
Jordan T. Ash
Surbhi Goel
A. Krishnamurthy
Cyril Zhang
LRM
21
46
0
01 Jun 2023
Explainability Techniques for Chemical Language Models
Stefan Hödl
William Robinson
Yoram Bachrach
Wilhelm Huck
Tal Kachman
17
4
0
25 May 2023
Can Transformers Learn to Solve Problems Recursively?
Shizhuo Zhang
Curt Tigges
Stella Biderman
Maxim Raginsky
Talia Ringer
15
13
0
24 May 2023
Pipeline MoE: A Flexible MoE Implementation with Pipeline Parallelism
Xin Chen
Hengheng Zhang
Xiaotao Gu
Kaifeng Bi
Lingxi Xie
Qi Tian
MoE
14
4
0
22 Apr 2023
An Introduction to Transformers
Richard E. Turner
ViT
18
0
0
20 Apr 2023
Evaluating self-attention interpretability through human-grounded experimental protocol
Milan Bhan
Nina Achache
Victor Legrand
A. Blangero
N. Chesneau
14
9
0
27 Mar 2023
Merging Decision Transformers: Weight Averaging for Forming Multi-Task Policies
Daniel Lawson
A. H. Qureshi
MoMe
OffRL
6
13
0
14 Mar 2023
Transformadores: Fundamentos teoricos y Aplicaciones
J. D. L. Torre
60
0
0
18 Feb 2023
Coinductive guide to inductive transformer heads
Adam Nemecek
12
0
0
03 Feb 2023
GAMMT: Generative Ambiguity Modeling Using Multiple Transformers
Xingcheng Xu
12
0
0
16 Nov 2022
Perceived personality state estimation in dyadic and small group interaction with deep learning methods
Kristian Fenech
Ádám Fodor
Sean P. Bergeron
R. R. Saboundji
Catharine Oertel
András Lőrincz
17
0
0
09 Nov 2022
TrAISformer -- A Transformer Network with Sparse Augmented Data Representation and Cross Entropy Loss for AIS-based Vessel Trajectory Prediction
Duong Nguyen
Ronan Fablet
19
24
0
08 Sep 2021