Steerable Transformers

24 May 2024

Soumyabrata Kundu

Risi Kondor

LLMSV

ViT


Main:9 Pages

6 Figures

Bibliography:3 Pages

4 Tables

Appendix:9 Pages

Abstract

In this work we introduce Steerable Transformers, an extension of the Vision Transformer mechanism that maintains equivariance to the special Euclidean group $\mathrm{SE}(d)$. We propose an equivariant attention mechanism that operates on features extracted by steerable convolutions. Operating in Fourier space, our network utilizes Fourier space non-linearities. Our experiments in both two and three dimensions show that adding a steerable transformer encoder layer to a steerable convolution network enhances performance.
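To make the idea of Fourier-space equivariance concrete, the following is a minimal sketch for the rotation part of $\mathrm{SE}(2)$ only. It is an illustration under stated assumptions, not the authors' implementation: the phase convention $e^{-ik\theta}$, the magnitude-based non-linearity, the conjugate dot-product score, and all function names are hypothetical choices made for this example. It checks numerically that a non-linearity acting only on coefficient magnitudes commutes with rotation, and that a query-key score of this form is unchanged when both inputs are rotated together.

```python
import cmath

def rotate(features, theta):
    """Rotate Fourier-space features by angle theta (assumed convention).

    `features` maps an integer frequency k to a complex coefficient; the
    frequency-k component is multiplied by the phase exp(-i * k * theta).
    """
    return {k: cmath.exp(-1j * k * theta) * z for k, z in features.items()}

def norm_nonlinearity(features, bias=-0.1):
    """Apply ReLU to each magnitude (plus a bias) and keep the phase unchanged."""
    out = {}
    for k, z in features.items():
        r = abs(z)
        scale = max(r + bias, 0.0) / r if r > 0 else 0.0
        out[k] = scale * z
    return out

def attention_score(query, key):
    """Illustrative score: real part of sum over k of conj(q_k) * key_k."""
    return sum((query[k].conjugate() * key[k]).real for k in query)

def max_abs_diff(a, b):
    """Largest coefficient-wise difference between two feature dictionaries."""
    return max(abs(a[k] - b[k]) for k in a)

theta = 0.7
q = {0: 1.0 + 0.0j, 1: 0.3 - 0.8j, 2: -0.5 + 0.2j}
kf = {0: 0.4 + 0.0j, 1: -0.6 + 0.1j, 2: 0.9 + 0.7j}

# Equivariance: rotating then applying the non-linearity should match
# applying the non-linearity then rotating, up to floating-point error.
equivariance_gap = max_abs_diff(
    norm_nonlinearity(rotate(q, theta)),
    rotate(norm_nonlinearity(q), theta),
)

# Invariance: the score should not change when query and key rotate together.
invariance_gap = abs(
    attention_score(rotate(q, theta), rotate(kf, theta)) - attention_score(q, kf)
)

print(equivariance_gap, invariance_gap)
```

Both gaps should be at the level of floating-point round-off. The sketch covers rotations in the plane only; translations and the three-dimensional case, where features transform under higher-dimensional representations, are outside its scope.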
