ResearchTrend.AI
Scaling Vision Transformers to 22 Billion Parameters

10 February 2023
Mostafa Dehghani
Josip Djolonga
Basil Mustafa
Piotr Padlewski
Jonathan Heek
Justin Gilmer
Andreas Steiner
Mathilde Caron
Robert Geirhos
Ibrahim M. Alabdulmohsin
Rodolphe Jenatton
Lucas Beyer
Michael Tschannen
Anurag Arnab
Xiao Wang
C. Riquelme
Matthias Minderer
J. Puigcerver
Utku Evci
Manoj Kumar
Sjoerd van Steenkiste
Gamaleldin F. Elsayed
Aravindh Mahendran
F. I. F. Richard Yu
Avital Oliver
Fantine Huot
Jasmijn Bastings
Mark Collier
A. Gritsenko
Vighnesh Birodkar
C. N. Vasconcelos
Yi Tay
Thomas Mensink
Alexander Kolesnikov
Filip Pavetić
Dustin Tran
Thomas Kipf
Mario Lučić
Xiaohua Zhai
Daniel Keysers
Jeremiah Harmsen
N. Houlsby
    MLLM

Papers citing "Scaling Vision Transformers to 22 Billion Parameters"

50 / 416 papers shown
Distilling Inductive Bias: Knowledge Distillation Beyond Model Compression
Gousia Habib
Tausifa Jan Saleem
Brejesh Lall
VLM
14
0
0
30 Sep 2023
Federated Learning with Differential Privacy for End-to-End Speech Recognition
Martin Pelikan
Sheikh Shams Azam
Vitaly Feldman
Jan Honza Silovsky
Kunal Talwar
Tatiana Likhomanenko
25
7
0
29 Sep 2023
GAIA-1: A Generative World Model for Autonomous Driving
Masane Fuchi
Lloyd Russell
Hudson Yeo
Zak Murez
Hiroto Minami
Alex Kendall
Tomohiro Takagi
Gianluca Corrado
VGen
13
215
0
29 Sep 2023
Intriguing properties of generative classifiers
P. Jaini
Kevin Clark
Robert Geirhos
BDL
14
33
0
28 Sep 2023
Neural scaling laws for phenotypic drug discovery
Drew Linsley
John Griffin
Jason Parker Brown
Adam N Roose
Michael Frank
Peter Linsley
Steven Finkbeiner
Jeremy W. Linsley
25
0
0
28 Sep 2023
Masked Autoencoders are Scalable Learners of Cellular Morphology
Oren Z. Kraus
Kian Kenyon-Dean
Saber Saberian
Maryam Fallah
Peter McLean
...
Chi Vicky Cheng
Kristen Morse
Maureen Makes
Ben Mabey
Berton A. Earnshaw
13
14
0
27 Sep 2023
CAIT: Triple-Win Compression towards High Accuracy, Fast Inference, and Favorable Transferability For ViTs
Ao Wang
Hui Chen
Zijia Lin
Sicheng Zhao
J. Han
Guiguang Ding
ViT
24
6
0
27 Sep 2023
Small-scale proxies for large-scale Transformer training instabilities
Mitchell Wortsman
Peter J. Liu
Lechao Xiao
Katie Everett
A. Alemi
...
Jascha Narain Sohl-Dickstein
Kelvin Xu
Jaehoon Lee
Justin Gilmer
Simon Kornblith
30
80
0
25 Sep 2023
Masked Image Residual Learning for Scaling Deeper Vision Transformers
Guoxi Huang
Hongtao Fu
A. Bors
19
7
0
25 Sep 2023
ClusterFormer: Clustering As A Universal Visual Learner
James Liang
Yiming Cui
Qifan Wang
Tong Geng
Wenguan Wang
Dongfang Liu
VLM
30
8
0
22 Sep 2023
Spatial-frequency channels, shape bias, and adversarial robustness
Ajay Subramanian
E. Sizikova
N. Majaj
D. Pelli
AAML
18
22
0
22 Sep 2023
NoisyNN: Exploring the Influence of Information Entropy Change in Learning Systems
Xiao-Xing Yu
Zhe Huang
Yao Xue
Lu Zhang
Li Wang
Tianming Liu
Dajiang Zhu
11
6
0
19 Sep 2023
Replacing softmax with ReLU in Vision Transformers
Mitchell Wortsman
Jaehoon Lee
Justin Gilmer
Simon Kornblith
ViT
22
29
0
15 Sep 2023
Scaling Laws for Sparsely-Connected Foundation Models
Elias Frantar
C. Riquelme
N. Houlsby
Dan Alistarh
Utku Evci
14
33
0
15 Sep 2023
Virchow: A Million-Slide Digital Pathology Foundation Model
Eugene Vorontsov
Alican Bozkurt
Adam Casson
George Shaikovski
Michal Zelechowski
...
Razik Yousfi
Christopher Kanan
David Klimstra
B. Rothrock
Thomas J. Fuchs
MedIm
11
81
0
14 Sep 2023
Towards Artificial General Intelligence (AGI) in the Internet of Things (IoT): Opportunities and Challenges
Fei Dou
Jin Ye
Geng Yuan
Qin Lu
Wei Niu
...
Hongyue Sun
Yunli Shao
Changying Li
Tianming Liu
Wenzhan Song
AI4CE
16
28
0
14 Sep 2023
Hydra: Multi-head Low-rank Adaptation for Parameter Efficient Fine-tuning
Sanghyeon Kim
Hyunmo Yang
Younghyun Kim
Youngjoon Hong
Eunbyung Park
AI4CE
18
16
0
13 Sep 2023
Can you text what is happening? Integrating pre-trained language encoders into trajectory prediction models for autonomous driving
Ali Keysan
Andreas Look
Eitan Kosman
Gonca Gürsun
Jörg Wagner
Yu Yao
Barbara Rakitsch
22
29
0
11 Sep 2023
DAT++: Spatially Dynamic Vision Transformer with Deformable Attention
Zhuofan Xia
Xuran Pan
Shiji Song
Li Erran Li
Gao Huang
ViT
19
22
0
04 Sep 2023
MoMA: Momentum Contrastive Learning with Multi-head Attention-based Knowledge Distillation for Histopathology Image Analysis
T. Vuong
J. T. Kwak
28
6
0
31 Aug 2023
AtmoRep: A stochastic model of atmosphere dynamics using large scale representation learning
C. Lessig
Ilaria Luise
Bing Gong
M. Langguth
S. Stadtler
Martin G. Schultz
15
28
0
25 Aug 2023
Local Distortion Aware Efficient Transformer Adaptation for Image Quality Assessment
Kangmin Xu
Liang Liao
Jing Xiao
Chaofeng Chen
Haoning Wu
Qiong Yan
Weisi Lin
ViT
13
5
0
23 Aug 2023
Unlocking Accuracy and Fairness in Differentially Private Image Classification
Leonard Berrada
Soham De
J. Shen
Jamie Hayes
Robert Stanforth
David Stutz
Pushmeet Kohli
Samuel L. Smith
Borja Balle
19
13
0
21 Aug 2023
Composable Function-preserving Expansions for Transformer Architectures
Andrea Gesmundo
Kaitlin Maile
AI4CE
27
8
0
11 Aug 2023
Temporally-Adaptive Models for Efficient Video Understanding
Ziyuan Huang
Shiwei Zhang
Liang Pan
Zhiwu Qing
Yingya Zhang
Ziwei Liu
Marcelo H. Ang
25
9
0
10 Aug 2023
UnIVAL: Unified Model for Image, Video, Audio and Language Tasks
Mustafa Shukor
Corentin Dancette
Alexandre Ramé
Matthieu Cord
MoMe
MLLM
27
42
0
30 Jul 2023
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control
Anthony Brohan
Noah Brown
Justice Carbajal
Yevgen Chebotar
Xi Chen
...
Ted Xiao
Peng-Tao Xu
Sichun Xu
Tianhe Yu
Brianna Zitkovich
LM&Ro
LRM
22
1,074
0
28 Jul 2023
Sample Less, Learn More: Efficient Action Recognition via Frame Feature Restoration
Harry Cheng
Yangyang Guo
Liqiang Nie
Zhiyong Cheng
Mohan S. Kankanhalli
33
7
0
27 Jul 2023
Towards Generalist Biomedical AI
Tao Tu
Shekoofeh Azizi
Danny Driess
M. Schaekermann
Mohamed Amin
...
Yossi Matias
K. Singhal
Peter R. Florence
Alan Karthikesalingam
Vivek Natarajan
LM&MA
MedIm
AI4MH
33
239
0
26 Jul 2023
Sparse Double Descent in Vision Transformers: real or phantom threat?
Victor Quétu
Marta Milovanović
Enzo Tartaglione
11
2
0
26 Jul 2023
Prompting Large Language Models with Speech Recognition Abilities
Yassir Fathullah
Chunyang Wu
Egor Lakomkin
J. Jia
Yuan Shangguan
...
Wenhan Xiong
Jay Mahadeokar
Ozlem Kalinli
Christian Fuegen
M. Seltzer
AuLLM
19
124
0
21 Jul 2023
DVPT: Dynamic Visual Prompt Tuning of Large Pre-trained Models for Medical Image Analysis
Along He
Kai Wang
Zhihong Wang
Tao Li
H. Fu
MedIm
17
2
0
19 Jul 2023
RepViT: Revisiting Mobile CNN From ViT Perspective
Ao Wang
Hui Chen
Zijia Lin
Hengjun Pu
Guiguang Ding
27
169
0
18 Jul 2023
A Survey of Techniques for Optimizing Transformer Inference
Krishna Teja Chitty-Venkata
Sparsh Mittal
M. Emani
V. Vishwanath
Arun Somani
29
60
0
16 Jul 2023
Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution
Mostafa Dehghani
Basil Mustafa
Josip Djolonga
Jonathan Heek
Matthias Minderer
...
Avital Oliver
Piotr Padlewski
A. Gritsenko
Mario Lučić
N. Houlsby
ViT
18
102
0
12 Jul 2023
Scale Alone Does not Improve Mechanistic Interpretability in Vision Models
Roland S. Zimmermann
Thomas Klein
Wieland Brendel
10
13
0
11 Jul 2023
URL: A Representation Learning Benchmark for Transferable Uncertainty Estimates
Michael Kirchhof
Bálint Mucsányi
Seong Joon Oh
Enkelejda Kasneci
UQCV
353
12
0
07 Jul 2023
All in One: Exploring Unified Vision-Language Tracking with Multi-Modal Alignment
Chunhui Zhang
Xin Sun
Li Liu
Yiqian Yang
Qiong Liu
Xiaoping Zhou
Yanfeng Wang
33
15
0
07 Jul 2023
LongNet: Scaling Transformers to 1,000,000,000 Tokens
Jiayu Ding
Shuming Ma
Li Dong
Xingxing Zhang
Shaohan Huang
Wenhui Wang
Nanning Zheng
Furu Wei
CLL
35
149
0
05 Jul 2023
DoReMi: Grounding Language Model by Detecting and Recovering from Plan-Execution Misalignment
Yanjiang Guo
Yen-Jen Wang
Lihan Zha
Zheyuan Jiang
Jianyu Chen
LM&Ro
19
39
0
01 Jul 2023
Stitched ViTs are Flexible Vision Backbones
Zizheng Pan
Jing Liu
Haoyu He
Jianfei Cai
Bohan Zhuang
11
2
0
30 Jun 2023
The Shaped Transformer: Attention Models in the Infinite Depth-and-Width Limit
Lorenzo Noci
Chuning Li
Mufan Bill Li
Bobby He
Thomas Hofmann
Chris J. Maddison
Daniel M. Roy
13
29
0
30 Jun 2023
End-to-end Autonomous Driving: Challenges and Frontiers
Li Chen
Peng Wu
Kashyap Chitta
Bernhard Jaeger
Andreas Geiger
Hongyang Li
3DV
34
260
0
29 Jun 2023
RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation based on Visual Foundation Model
Keyan Chen
Chenyang Liu
Hao Chen
Haotian Zhang
Wenyuan Li
Zhengxia Zou
Z. Shi
VLM
16
195
0
28 Jun 2023
OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents
Hugo Laurençon
Lucile Saulnier
Léo Tronchon
Stas Bekman
Amanpreet Singh
...
Siddharth Karamcheti
Alexander M. Rush
Douwe Kiela
Matthieu Cord
Victor Sanh
16
227
0
21 Jun 2023
EquiformerV2: Improved Equivariant Transformer for Scaling to Higher-Degree Representations
Yidong Liao
Brandon M. Wood
Abhishek Das
Tess E. Smidt
11
128
0
21 Jun 2023
Pushing the Limits of 3D Shape Generation at Scale
Wang Yu
Xuelin Qian
Jingyang Huo
Tiejun Huang
Bo-Lu Zhao
Yanwei Fu
21
11
0
20 Jun 2023
Scaling Open-Vocabulary Object Detection
Matthias Minderer
A. Gritsenko
N. Houlsby
VLM
ObjD
24
172
0
16 Jun 2023
DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data
Stephanie Fu
Netanel Y. Tamir
Shobhita Sundaram
Lucy Chai
Richard Y. Zhang
Tali Dekel
Phillip Isola
EGVM
18
95
0
15 Jun 2023
One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning
Arnav Chavan
Zhuang Liu
D. K. Gupta
Eric P. Xing
Zhiqiang Shen
22
87
0
13 Jun 2023