Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2403.06925
Cited By
Transformers Learn Low Sensitivity Functions: Investigations and Implications
11 March 2024
Bhavya Vasudeva
Deqing Fu
Tianyi Zhou
Elliott Kau
Youqi Huang
Vatsal Sharan
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Transformers Learn Low Sensitivity Functions: Investigations and Implications"
8 / 8 papers shown
Title
Towards Understanding the Word Sensitivity of Attention Layers: A Study via Random Features
Simone Bombari
Marco Mondelli
46
4
0
05 Feb 2024
Improving the Robustness of Transformer-based Large Language Models with Dynamic Attention
Lujia Shen
Yuwen Pu
Shouling Ji
Changjiang Li
Xuhong Zhang
Chunpeng Ge
Ting Wang
AAML
34
4
0
29 Nov 2023
How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model
Michael Hanna
Ollie Liu
Alexandre Variengien
LRM
212
123
0
30 Apr 2023
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
Kevin Wang
Alexandre Variengien
Arthur Conmy
Buck Shlegeris
Jacob Steinhardt
218
515
0
01 Nov 2022
Patches Are All You Need?
Asher Trockman
J. Zico Kolter
ViT
225
403
0
24 Jan 2022
Intriguing Properties of Vision Transformers
Muzammal Naseer
Kanchana Ranasinghe
Salman Khan
Munawar Hayat
Fahad Shahbaz Khan
Ming-Hsuan Yang
ViT
265
626
0
21 May 2021
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
310
2,896
0
15 Sep 2016
Densely Connected Convolutional Networks
Gao Huang
Zhuang Liu
Laurens van der Maaten
Kilian Q. Weinberger
PINN
3DV
342
36,437
0
25 Aug 2016
1