Vision Transformers provably learn spatial structure (arXiv:2210.09221)
13 October 2022
Samy Jelassi, Michael E. Sander, Yuanzhi Li
Tags: ViT, MLT
Papers citing "Vision Transformers provably learn spatial structure" (21 of 21 papers shown)
1. "Transformers for Learning on Noisy and Task-Level Manifolds: Approximation and Generalization Insights". Zhaiming Shen, Alex Havrilla, Rongjie Lai, A. Cloninger, Wenjing Liao. 06 May 2025.
2. "When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers". Hongkang Li, Yihua Zhang, Shuai Zhang, M. Wang, Sijia Liu, Pin-Yu Chen. 15 Apr 2025. [MoMe]
3. "Gradient Descent Robustly Learns the Intrinsic Dimension of Data in Training Convolutional Neural Networks". Chenyang Zhang, Peifeng Gao, Difan Zou, Yuan Cao. 11 Apr 2025. [OOD, MLT]
4. "PViT: Prior-augmented Vision Transformer for Out-of-distribution Detection". Tianhao Zhang, Zhixiang Chen, Lyudmila Mihaylova. 27 Oct 2024.
5. "Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent". Bo Chen, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao-quan Song. 15 Oct 2024.
6. "Locality Alignment Improves Vision-Language Models". Ian Covert, Tony Sun, James Y. Zou, Tatsunori Hashimoto. 14 Oct 2024. [VLM]
7. "Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data". Binghui Li, Yuanzhi Li. 11 Oct 2024. [OOD]
8. "Attention layers provably solve single-location regression". P. Marion, Raphael Berthier, Gérard Biau, Claire Boyer. 02 Oct 2024.
9. "Are queries and keys always relevant? A case study on Transformer wave functions". Riccardo Rende, Luciano Loris Viteritti. 29 May 2024.
10. "Dissecting the Interplay of Attention Paths in a Statistical Mechanics Theory of Transformers". Lorenzo Tiberi, Francesca Mignacco, Kazuki Irie, H. Sompolinsky. 24 May 2024.
11. "Implicit Bias and Fast Convergence Rates for Self-attention". Bhavya Vasudeva, Puneesh Deora, Christos Thrampoulidis. 08 Feb 2024.
12. "An Information-Theoretic Analysis of In-Context Learning". Hong Jun Jeon, Jason D. Lee, Qi Lei, Benjamin Van Roy. 28 Jan 2024.
13. "Memorization Capacity of Multi-Head Attention in Transformers". Sadegh Mahdavi, Renjie Liao, Christos Thrampoulidis. 03 Jun 2023.
14. "A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity". Hongkang Li, M. Wang, Sijia Liu, Pin-Yu Chen. 12 Feb 2023. [ViT, MLT]
15. "Patches Are All You Need?". Asher Trockman, J. Zico Kolter. 24 Jan 2022. [ViT]
16. "CMT: Convolutional Neural Networks Meet Vision Transformers". Jianyuan Guo, Kai Han, Han Wu, Yehui Tang, Chunjing Xu, Yunhe Wang, Chang Xu. 13 Jul 2021. [ViT]
17. "Intriguing Properties of Vision Transformers". Muzammal Naseer, Kanchana Ranasinghe, Salman Khan, Munawar Hayat, F. Khan, Ming-Hsuan Yang. 21 May 2021. [ViT]
18. "Emerging Properties in Self-Supervised Vision Transformers". Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, Armand Joulin. 29 Apr 2021.
19. "LambdaNetworks: Modeling Long-Range Interactions Without Attention". Irwan Bello. 17 Feb 2021.
20. "LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning". Yuhuai Wu, M. Rabe, Wenda Li, Jimmy Ba, Roger C. Grosse, Christian Szegedy. 15 Jan 2021. [AIMat, LRM]
21. "On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima". N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang. 15 Sep 2016. [ODL]