Scaling Vision Transformers to 22 Billion Parameters

10 February 2023
Mostafa Dehghani
Josip Djolonga
Basil Mustafa
Piotr Padlewski
Jonathan Heek
Justin Gilmer
Andreas Steiner
Mathilde Caron
Robert Geirhos
Ibrahim M. Alabdulmohsin
Rodolphe Jenatton
Lucas Beyer
Michael Tschannen
Anurag Arnab
Xiao Wang
C. Riquelme
Matthias Minderer
J. Puigcerver
Utku Evci
Manoj Kumar
Sjoerd van Steenkiste
Gamaleldin F. Elsayed
Aravindh Mahendran
F. I. F. Richard Yu
Avital Oliver
Fantine Huot
Jasmijn Bastings
Mark Collier
A. Gritsenko
Vighnesh Birodkar
C. N. Vasconcelos
Yi Tay
Thomas Mensink
Alexander Kolesnikov
Filip Pavetić
Dustin Tran
Thomas Kipf
Mario Lučić
Xiaohua Zhai
Daniel Keysers
Jeremiah Harmsen
N. Houlsby
    MLLM

Papers citing "Scaling Vision Transformers to 22 Billion Parameters"

50 / 416 papers shown
Foundation Models in Robotics: Applications, Challenges, and the Future
Roya Firoozi
Johnathan Tucker
Stephen Tian
Anirudha Majumdar
Jiankai Sun
...
Brian Ichter
Danny Driess
Jiajun Wu
Cewu Lu
Mac Schwager
LM&Ro
AI4CE
LRM
VLM
35
136
0
13 Dec 2023
Remote Sensing Vision-Language Foundation Models without Annotations via Ground Remote Alignment
Utkarsh Mall
Cheng Perng Phoo
Meilin Kelsey Liu
Carl Vondrick
B. Hariharan
Kavita Bala
VLM
14
36
0
12 Dec 2023
Photorealistic Video Generation with Diffusion Models
Agrim Gupta
Lijun Yu
Kihyuk Sohn
Xiuye Gu
Meera Hahn
Fei-Fei Li
Irfan Essa
Lu Jiang
José Lezama
VGen
39
172
0
11 Dec 2023
4M: Massively Multimodal Masked Modeling
David Mizrahi
Roman Bachmann
Oğuzhan Fatih Kar
Teresa Yeo
Mingfei Gao
Afshin Dehghan
Amir Zamir
MLLM
34
62
0
11 Dec 2023
Structured Inverse-Free Natural Gradient: Memory-Efficient & Numerically-Stable KFAC
Wu Lin
Felix Dangel
Runa Eschenhagen
Kirill Neklyudov
Agustinus Kristiadi
Richard E. Turner
Alireza Makhzani
6
3
0
09 Dec 2023
Neither hype nor gloom do DNNs justice
Gaurav Malhotra
Christian Tsvetkov
B. D. Evans
19
115
0
08 Dec 2023
Adapting Vision Transformer for Efficient Change Detection
Yang Zhao
Yuxiang Zhang
Yanni Dong
Bo Du
VLM
20
2
0
08 Dec 2023
Scaling Laws of Synthetic Images for Model Training ... for Now
Lijie Fan
Kaifeng Chen
Dilip Krishnan
Dina Katabi
Phillip Isola
Yonglong Tian
CLIP
VLM
22
60
0
07 Dec 2023
GenTron: Diffusion Transformers for Image and Video Generation
Shoufa Chen
Mengmeng Xu
Jiawei Ren
Yuren Cong
Sen He
Yanping Xie
Animesh Sinha
Ping Luo
Tao Xiang
Juan-Manuel Perez-Rua
VGen
31
37
0
07 Dec 2023
SAMBA: A Trainable Segmentation Web-App with Smart Labelling
Ronan Docherty
Isaac Squires
Antonis Vamvakeros
Samuel J. Cooper
10
4
0
07 Dec 2023
Open-sourced Data Ecosystem in Autonomous Driving: the Present and Future
Hongyang Li
Yang Li
Huijie Wang
Jia Zeng
Huilin Xu
...
Kai Yan
Beipeng Mu
Zhihui Peng
Shaoqing Ren
Yu Qiao
16
23
0
06 Dec 2023
Visual Program Distillation: Distilling Tools and Programmatic Reasoning into Vision-Language Models
Yushi Hu
Otilia Stretcu
Chun-Ta Lu
Krishnamurthy Viswanathan
Kenji Hata
Enming Luo
Ranjay Krishna
Ariel Fuxman
VLM
LRM
MLLM
32
28
0
05 Dec 2023
Rejuvenating image-GPT as Strong Visual Representation Learners
Sucheng Ren
Zeyu Wang
Hongru Zhu
Junfei Xiao
Alan L. Yuille
Cihang Xie
VLM
39
7
0
04 Dec 2023
Bootstrapping SparseFormers from Vision Foundation Models
Ziteng Gao
Zhan Tong
K. Lin
Joya Chen
Mike Zheng Shou
25
0
0
04 Dec 2023
Language-conditioned Detection Transformer
Jang Hyun Cho
Philipp Krahenbuhl
VLM
ObjD
42
1
0
29 Nov 2023
Leveraging VLM-Based Pipelines to Annotate 3D Objects
Rishabh Kabra
Loic Matthey
Alexander Lerchner
Niloy J. Mitra
10
6
0
29 Nov 2023
Federated Fine-Tuning of Foundation Models via Probabilistic Masking
Vasileios Tsouvalas
Yuki M. Asano
Aaqib Saeed
FedML
79
3
0
29 Nov 2023
Critical Influence of Overparameterization on Sharpness-aware Minimization
Sungbin Shin
Dongyeop Lee
Maksym Andriushchenko
Namhoon Lee
AAML
39
1
0
29 Nov 2023
TransNeXt: Robust Foveal Visual Perception for Vision Transformers
Dai Shi
ViT
13
72
0
28 Nov 2023
ScribbleGen: Generative Data Augmentation Improves Scribble-supervised Semantic Segmentation
Jacob Schnell
Jieke Wang
Lu Qi
Vincent Tao Hu
Meng Tang
DiffM
18
3
0
28 Nov 2023
Side4Video: Spatial-Temporal Side Network for Memory-Efficient Image-to-Video Transfer Learning
Huanjin Yao
Wenhao Wu
Zhiheng Li
VLM
87
9
0
27 Nov 2023
An Empirical Investigation into Benchmarking Model Multiplicity for Trustworthy Machine Learning: A Case Study on Image Classification
Prakhar Ganesh
23
5
0
24 Nov 2023
ADriver-I: A General World Model for Autonomous Driving
Fan Jia
Weixin Mao
Yingfei Liu
Yucheng Zhao
Yuqing Wen
Chi Zhang
Xiangyu Zhang
Tiancai Wang
22
63
0
22 Nov 2023
Applications of Large Scale Foundation Models for Autonomous Driving
Yu Huang
Yue Chen
Zhu Li
ELM
AI4CE
LRM
ALM
LM&Ro
46
15
0
20 Nov 2023
Generalized Category Discovery in Semantic Segmentation
Zhengyuan Peng
Qijian Tian
Jianqing Xu
Yizhang Jin
Xuequan Lu
Xin Tan
Yuan Xie
Lizhuang Ma
ISeg
12
2
0
20 Nov 2023
ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy
Kirill Vishniakov
Zhiqiang Shen
Zhuang Liu
CLIP
12
15
0
15 Nov 2023
MeLo: Low-rank Adaptation is Better than Fine-tuning for Medical Image Diagnosis
Yitao Zhu
Zhenrong Shen
Zihao Zhao
Sheng Wang
Xin Wang
Xiangyu Zhao
Dinggang Shen
Qian Wang
MedIm
32
28
0
14 Nov 2023
AudioChatLlama: Towards General-Purpose Speech Abilities for LLMs
Yassir Fathullah
Chunyang Wu
Egor Lakomkin
Ke Li
Junteng Jia
Shangguan Yuan
Jay Mahadeokar
Ozlem Kalinli
Christian Fuegen
Michael Seltzer
LM&MA
MLLM
AuLLM
14
33
0
12 Nov 2023
Harnessing Synthetic Datasets: The Role of Shape Bias in Deep Neural Network Generalization
Elior Benarous
Sotiris Anagnostidis
Luca Biggio
Thomas Hofmann
14
3
0
10 Nov 2023
OtterHD: A High-Resolution Multi-modality Model
Bo-wen Li
Peiyuan Zhang
Jingkang Yang
Yuanhan Zhang
Fanyi Pu
Ziwei Liu
VLM
MLLM
30
65
0
07 Nov 2023
Scene-Driven Multimodal Knowledge Graph Construction for Embodied AI
Yaoxian Song
Penglei Sun
Haoyu Liu
Li Zhixu
Wei Song
Yanghua Xiao
Xiaofang Zhou
LM&Ro
51
12
0
07 Nov 2023
Navigating Scaling Laws: Compute Optimality in Adaptive Model Training
Sotiris Anagnostidis
Gregor Bachmann
Imanol Schlag
Thomas Hofmann
23
2
0
06 Nov 2023
Large Language Models Illuminate a Progressive Pathway to Artificial Healthcare Assistant: A Review
Mingze Yuan
Peng Bao
Jiajia Yuan
Yunhao Shen
Zi Chen
...
Jie Zhao
Yang Chen
Li Zhang
Lin Shen
Bin Dong
ELM
LM&MA
41
13
0
03 Nov 2023
Simplifying Transformer Blocks
Bobby He
Thomas Hofmann
11
28
0
03 Nov 2023
Towards Calibrated Robust Fine-Tuning of Vision-Language Models
Changdae Oh
Hyesu Lim
Mijoo Kim
Dongyoon Han
Junhyeok Park
Euiseog Jeong
Alexander G. Hauptmann
Zhi-Qi Cheng
Kyungwoo Song
VLM
16
13
0
03 Nov 2023
RTP: Rethinking Tensor Parallelism with Memory Deduplication
Cheng Luo
Tianle Zhong
Geoffrey C. Fox
19
3
0
02 Nov 2023
Distilling Out-of-Distribution Robustness from Vision-Language Foundation Models
Andy Zhou
Jindong Wang
Yu-xiong Wang
Haohan Wang
VLM
33
6
0
02 Nov 2023
Res-Tuning: A Flexible and Efficient Tuning Paradigm via Unbinding Tuner from Backbone
Zeyinzi Jiang
Chaojie Mao
Ziyuan Huang
Ao Ma
Yiliang Lv
Yujun Shen
Deli Zhao
Jingren Zhou
17
15
0
30 Oct 2023
On consequences of finetuning on data with highly discriminative features
Wojciech Masarczyk
Tomasz Trzciński
M. Ostaszewski
17
0
0
30 Oct 2023
Emergence of Shape Bias in Convolutional Neural Networks through Activation Sparsity
Tianqin Li
Ziqi Wen
Yangfan Li
Tai Sing Lee
11
10
0
29 Oct 2023
Socially Cognizant Robotics for a Technology Enhanced Society
Kristin J. Dana
Clinton Andrews
Kostas Bekris
Jacob Feldman
Matthew Stone
Pernille Hemmer
Aaron Mazzeo
Hal Salzman
Jingang Yi
11
0
0
27 Oct 2023
A Unified, Scalable Framework for Neural Population Decoding
Mehdi Azabou
Vinam Arora
Venkataramana Ganesh
Ximeng Mao
Santosh Nachimuthu
Michael J. Mendelson
Blake A. Richards
M. Perich
Guillaume Lajoie
Eva L. Dyer
HAI
AI4TS
19
35
0
24 Oct 2023
Extending Input Contexts of Language Models through Training on Segmented Sequences
Petros Karypis
Julian McAuley
George Karypis
22
0
0
23 Oct 2023
Data-Free Knowledge Distillation Using Adversarially Perturbed OpenGL Shader Images
Logan Frank
Jim Davis
20
1
0
20 Oct 2023
Eureka-Moments in Transformers: Multi-Step Tasks Reveal Softmax Induced Optimization Problems
David T. Hoffmann
Simon Schrodi
Jelena Bratulić
Nadine Behrmann
Volker Fischer
Thomas Brox
19
5
0
19 Oct 2023
Functional Invariants to Watermark Large Transformers
Pierre Fernandez
Guillaume Couairon
Teddy Furon
Matthijs Douze
6
8
0
17 Oct 2023
PaLI-3 Vision Language Models: Smaller, Faster, Stronger
Xi Chen
Xiao Wang
Lucas Beyer
Alexander Kolesnikov
Jialin Wu
...
Keran Rong
Tianli Yu
Daniel Keysers
Xiao-Qi Zhai
Radu Soricut
MLLM
VLM
28
92
0
13 Oct 2023
MatFormer: Nested Transformer for Elastic Inference
Devvrit
Sneha Kudugunta
Aditya Kusupati
Tim Dettmers
Kaifeng Chen
...
Yulia Tsvetkov
Hannaneh Hajishirzi
Sham Kakade
Ali Farhadi
Prateek Jain
26
22
0
11 Oct 2023
Multiple Physics Pretraining for Physical Surrogate Models
Michael McCabe
Bruno Régaldo-Saint Blancard
Liam Parker
Ruben Ohana
M. Cranmer
...
Francois Lanusse
Mariel Pettee
Tiberiu Teşileanu
Kyunghyun Cho
Shirley Ho
PINN
AI4CE
18
50
0
04 Oct 2023
Win-Win: Training High-Resolution Vision Transformers from Two Windows
Vincent Leroy
Jérôme Revaud
Thomas Lucas
Philippe Weinzaepfel
ViT
27
2
0
01 Oct 2023