Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2403.02411
Cited By
NiNformer: A Network in Network Transformer with Token Mixing Generated Gating Function
4 March 2024
Abdullah Nazhat Abdullah
Tarkan Aydin
Re-assign community
ArXiv
PDF
HTML
Papers citing
"NiNformer: A Network in Network Transformer with Token Mixing Generated Gating Function"
7 / 7 papers shown
Title
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
Soham De
Samuel L. Smith
Anushan Fernando
Aleksandar Botev
George-Christian Muraru
...
David Budden
Yee Whye Teh
Razvan Pascanu
Nando de Freitas
Çağlar Gülçehre
Mamba
53
117
0
29 Feb 2024
On The Computational Complexity of Self-Attention
Feyza Duman Keles
Pruthuvi Maheshakya Wijewardena
C. Hegde
63
107
0
11 Sep 2022
Patches Are All You Need?
Asher Trockman
J. Zico Kolter
ViT
214
400
0
24 Jan 2022
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
239
2,592
0
04 May 2021
Transformer in Transformer
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
ViT
282
1,518
0
27 Feb 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
F. Khan
M. Shah
ViT
225
2,427
0
04 Jan 2021
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
249
2,009
0
28 Jul 2020
1